Abstract


Through  the  use  of  the  principle  of  minimum  discrimination 
information  estimation,  leading  to  exponential  families  or 
multiplicative  models  or  log-linear  models  it  is  shown,  using 
illustrative  examples  exhibiting  different  aspects  of 
contingency  table  analysis,  that: 

(1)  Estimates  of  the  cell  entries  under  various  hypotheses 
or  models  can  be  obtained; 

(2)  The  adequacy  or  fit  of  the  model,  or  the  null 
hypothesis,  can  be  tested; 

(3)  Main  effect  and  interaction  parameters  can  be 
estimated; 

(4)  The  structure  of  the  table  can  be  studied  in  detail 
in  terms  of  the  various  interrelationships  among  the 
classificatory  variables; 

(5)  The  procedures  can  be  applied  to  test  hypotheses 
about  particular  parameters  and  linear  combinations 
of  parameters  that  are  of  special  interest; 

(6)  The  procedures  provide  indication  of  outlier  cells; 

(7) Since the procedures and concepts are based on a
general principle, a unified treatment of multidimensional
contingency tables is possible;


(8)  The  procedure  provides  estimates  based  on  an 
observed  or  sample  table,  which  satisfy  certain 
external  hypotheses  as  to  underlying  probability 
relations  in  the  population  table.  These  estimates 
also  preserve  the  inherent  properties  of  the  observed 
data  not  affected  by  the  hypothesis; 

(9) In general, the m.d.i. estimates are best asymptotically
normal;

(10)  The  minimum  discrimination  information  test 
statistics  are  asymptotically  distributed  as 
chi-squared  with  appropriate  degrees  of  freedom; 

(11) Convergent iterative computer algorithms are
available for the analyses.

1. Introduction


Data  which  result  from  experiments  in  the  physical  sciences  and 
engineering are usually outcomes of controlled experiments, and expressible
in quantitative terms. In many other fields, however, the data are seldom
results  of  controlled  experiments.  In  addition,  the  observations  usually 
can  be  expressed  only  in  qualitative  or  categorical  terms,  a  yes  -  no, 
alive  -  dead,  agree  -  disagree,  class  A  -  class  B  -  class  C,  etc.  type  of 
response. 

For example, an individual may be classified by sex, by race, by
profession, by smoking habit, by age, by incidence of coronary heart
disease. If we take observations over a sample of many such individuals,
the result will be a multidimensional contingency table with as many
dimensions as there are classifications. Contingency tables are
cross-classifications of vectors of discrete random variables showing the number
of subjects belonging to distinct categories of each of several qualitative
or categorical classifications. The number of counts of individuals in a
cell  of  this  table  represents  that  portion  of  the  sample  having  the 
specific  attributes  within  each  of  the  classifications.  A  problem  of 
interest,  for  example,  might  be  to  determine  the  factors  that  are  associated 
with  the  presence  or  absence  of  coronary  heart  disease. 

Data from many fields are often presented in this manner, that is,
in a cross-tabulated form. Statistical analyses of these types of data have
had a long history, as may be seen from the bibliography, but were mainly
concerned with the simplest kind, the two-way table. Analyses of
multidimensional contingency tables have been investigated intensively
only during the last decade or so.

Conclusions  drawn  from  contingency  tables  may  be  only  exploratory  in 
nature.  One  of  the  difficulties  can  be  the  availability  of  meaningful  and 
reliable data. The first problem one faces in the analysis of
cross-classified data is the decision on the number of classifications to be
included and the categories within each classification. Typical among the
problems in the analysis is how to segregate the effect on the response of
some of the background variables, individually or jointly, from that of
the others that are of particular interest. The data analytic attitude is
empirical  rather  than  theoretical.  A  more  empirical  attitude  is  natural 
when  detailed  theoretical  understanding  is  unavailable.  Estimation  of 
parameters  in  models  should  be  considered  less  as  attempts  to  discover 
underlying  truths  and  more  as  data  calibrating  devices  which  make  it  easier 
to  conceive  of  noisy  data  in  terms  of  smooth  distributions  and  relations. 
With  a  given  data  set,  a  variety  of  models  may  be  tried  on,  and  one 
selected  on  the  ground  of  looks  and  fit.  (See  Dempster  (1971).) 

Consider,  for  example,  an  experiment  performed  to  compare  the 
effectiveness  of  safety  release  devices  for  refrigerators  in  relation  to 
children's safety. Children between two and five years of age are induced
to  crawl  into  refrigerators  equipped  with  six  different  types  of  release 
devices.  If  a  child  can  open  the  door  of  the  refrigerator,  from  inside, 
within  a  certain  time  period,  the  response  is  classified  as  a  success, 
otherwise  a  failure.  The  background  variables  studied  included  age,  sex, 
weight,  socio-economic  status  of  parents.  The  experimental  variable  was 
one of six devices. (A partial analysis of these data may be found in
Kullback et al. 1962b, p. [illegible].) Some balancing of the background variables
was achieved.

In  other  instances  none  of  the  factors  are  subject  to  experimental 
control, and whatever available data could be collected are reported. The
analysis of this type of data, though it may only be seeking preliminary
information, can be important in fields of health and safety. The
uncontrolled experimental data are sometimes the only realistic data
available when these data deal with life, death, health, and safety, and
some of these factors and responses are only expressible in qualitative
terms, in the present state of the art.

It  is  expected  that  the  number  of  problems  calling  for  the  techniques 
of  the  analysis  of  multidimensional  contingency  tables  will  increase. 
Experience  at  the  George  Washington  University  with  such  a  growing  demand 
confirms  this.  The  examination  and  interpretation  of  data  from  social 
phenomena,  housing,  psychology,  education,  environmental  problems,  health, 
safety,  manpower,  business,  experimental  testing  of  devices,  military 
research  and  development,  etc.,  are  potential  source  areas. 

Critics of methods for contingency table analysis have maintained
that most of the procedures used, at least in the past, were only of a
global chi-squared test nature. A recent example of this is nevertheless
to be found in Patil (1974). Through the use of the principle of minimum discrimination
information (m.d.i.) estimation, leading to exponential families or
multiplicative models or log-linear models, we shall show, using illustrative
examples exhibiting different aspects, that:

(1) Estimates of the cell entries under various hypotheses or
models can be obtained;

(2) The adequacy or fit of the model, or the null hypothesis, can
be tested;

(3)  Main  effect  and  interaction  parameters  can  be  estimated; 

(4) The structure of the table can be studied in detail in terms of
the various interrelationships among the classificatory
variables;

(5) The procedures can be applied to test hypotheses about particular
parameters and linear combinations of parameters that are of
special interest;

(6)  The  procedures  provide  indication  of  outlier  cells.  These  may 
cause  a  model  not  to  fit  overall,  yet  fit  the  other  cells 
excluding  the  outliers; 

(7) Since the procedures and concepts are based on a general principle,
a unified treatment of multidimensional contingency tables is
possible. Sequences of generalizations step by step to higher
order dimensional contingency tables are not necessary, as has
been the case with other ad hoc procedures (see for example,
Patil (1974), Sugiura and Otake (1974));

(8) The procedure provides estimates based on an observed or sample
table, which satisfy certain external hypotheses as to underlying
probability relations in the population table. These
estimates also preserve the inherent properties of the observed
data not affected by the hypothesis;

(9) In general, the m.d.i. estimates are best asymptotically normal
(BAN) and in the many applications of fitting models to a table
based on observed sets of marginal values the m.d.i. estimates
in particular are maximum-likelihood estimates;

(10) The test statistics are minimum discrimination information
(m.d.i.) statistics which are asymptotically distributed as
chi-squared  with  appropriate  degrees  of  freedom.  In  the  case 
of  fitting  models  to  a  table  based  on  observed  sets  of  marginal 
values  the  m.d.i.  statistics  are  log-likelihood  ratio  statistics. 
The  m.d.i.  statistics  are  additive,  as  are  the  associated  degrees 
of  freedom,  so  that  the  total  under  an  hypothesis  can  be  analyzed 
into  components  each  under  a  sub-hypothesis.  The  analysis  is 
analogous  to  analysis  of  variance  and  regression  analysis 
techniques,  using  a  design  matrix,  a  set  of  regression  parameters, 
and  explanatory  variables. 

(11) In models fitting estimates to an observed table based on sets of
observed marginal values as explanatory variables, some estimates
can be expressed explicitly as products of marginal values.
However, this is not generally true, and expected cell frequencies
(functions of marginal values) can be computed by an iterative
proportional fitting procedure (Ku et al. (1971)), and the use
of a computer to perform the iterations becomes necessary. For
the foregoing cases, which we shall term internal, and problems
involving tests of external hypotheses on underlying populations,
a number of iterative computer programs are available. They
provide  as  output,  design  matrices,  the  observed  cell  entries 
and  the  cell  estimates  as  well  as  their  logarithms,  parameter 
estimates,  outlier  values,  m.d.i.  statistics  and  their 
corresponding  significance  levels,  and  covariance  matrices  of 
parameter  estimates,  to  assist  in  and  simplify  the  numerical 
aspects of the inference. In this respect it is of interest to
cite the following quotation from a book review by D. J. Finney
in Journal Royal Statistical Society, Series A (General) Vol.
136 (1973), Part 3, p. 461, "No mention is made of the extent
to  which  computers  have  destroyed  the  need  to  assess  statistical 
methods  in  terms  of  arithmetical  simplicity:  indeed  the 
emphasis  on  avoiding  lengthy,  but  easily  programmed,  iterative 
calculations  is  remarkable." 
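
The iterative proportional fitting idea mentioned in item (11) can be illustrated with a minimal sketch. It assumes a three-way table whose fitted values are required to reproduce all three sets of observed two-way marginals; the function name, the number of cycles, and the small table of counts are hypothetical choices made for illustration and are not taken from the programs or data referred to in the text.

```python
def iterative_proportional_fit(x, n_cycles=100):
    """Fit a three-way table that reproduces the two-way marginals of x.

    x is a nested list x[i][j][k] of positive observed counts.
    Starting from a table of ones, each cycle rescales the fitted
    values to agree in turn with x(ij.), x(i.k) and x(.jk).
    """
    I, J, K = len(x), len(x[0]), len(x[0][0])
    fit = [[[1.0 for _ in range(K)] for _ in range(J)] for _ in range(I)]
    for _ in range(n_cycles):
        # Scale so that the (i, j) marginals agree with x(ij.).
        for i in range(I):
            for j in range(J):
                ratio = sum(x[i][j]) / sum(fit[i][j])
                for k in range(K):
                    fit[i][j][k] *= ratio
        # Scale so that the (i, k) marginals agree with x(i.k).
        for i in range(I):
            for k in range(K):
                ratio = (sum(x[i][j][k] for j in range(J))
                         / sum(fit[i][j][k] for j in range(J)))
                for j in range(J):
                    fit[i][j][k] *= ratio
        # Scale so that the (j, k) marginals agree with x(.jk).
        for j in range(J):
            for k in range(K):
                ratio = (sum(x[i][j][k] for i in range(I))
                         / sum(fit[i][j][k] for i in range(I)))
                for i in range(I):
                    fit[i][j][k] *= ratio
    return fit


# Hypothetical 2 x 2 x 2 table of counts, used only to exercise the sketch.
counts = [[[10, 20], [30, 40]], [[50, 30], [20, 10]]]
fitted = iterative_proportional_fit(counts)
```

A fixed number of cycles is used here for simplicity; a practical program would more likely stop when successive cycles change the fitted values by less than a chosen tolerance.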

Classical  problems  in  the  historical  development  of  the  analysis  of 
contingency  tables  concerned  themselves  primarily  with  such  questions  as  the 
independence  or  conditional  independence  of  the  classificatory  variables, 
or  homogeneity  or  conditional  homogeneity  of  the  classificatory  variables 
over  time  or  space,  for  example,  similar  to  such  tests  in  multivariate 
analysis as independence, multiple correlation, partial correlation,
canonical  correlation,  etc.  Such  classical  problems  turn  out  to  be  special 
cases  of  the  techniques  we  shall  discuss.  (See  for  example  Kullback  et  al. 
1962a,  1962b.)  These  techniques  result  in  analyses  which  are  essentially 
regression type analyses. As such they enable us to determine the relationship
of one or more "dependent" qualitative or categorical variables of
interest on a set of "independent" classificatory variables, as well as the
relative effects of changes in the "independent" variables on the
"dependent" variables. The object of the analyses is the study of the
interaction  between  and  among  the  classifications.  The  term  interaction  is 
used here in a general sense to cover both dependence and association (see
for example, Bartlett (1935), Simpson (1951), Roy and Kastenbaum (1956),
Ku et al. (1971)). It may be noted here that in a seminar on a study of
the historical development of the concept of interaction in the analysis
of multidimensional contingency tables, the following series of papers,
among the many that could be selected, was found to be very instructive:
Bartlett (1935), Lancaster (1951), Simpson (1951), Roy and Kastenbaum
(1956), Darroch (1962), Lewis (1962), Plackett (1962, 1969), Birch (1963,
1964, 1965), Goodman (1963b, 1970, 1971), Good (1963), Kastenbaum (1965),
Mantel (1966), Berkson (1968, 1972), Bhapkar and Koch (1968a, 1968b), Ku
and Kullback (1968), Dempster (1971), Ku, Varner and Kullback (1971). It
was pointed out by Darroch (1962) that "interaction" in contingency tables
enjoys only a few of the fortuitously simple properties of interactions in
the analysis of variance. (See Kullback, 1973.)

Following  this  general  introduction  we  shall  consider  further  aspects 
of  contingency  tables  in  greater  expository  detail.  We  then  present  an 
introduction  to  minimum  discrimination  information  estimation,  the  log- 
linear  representation,  associated  design  matrices  and  parameters,  without 
detailed  mathematical  proofs.  This  will  enable  the  reader  then  to  study 
the  many  illustrative  examples  that  follow  and  present  various  aspects  of 
the  possible  analyses.  The  mathematical  statistical  proofs  etc.  are  to  be 
found at the end of the presentation.

2. Contingency Tables

1. Description

There are two ways in which statistical data are
collected. In one form, actual measurements are recorded
for each individual in the sample; in the other,
the individuals are classified as belonging to different
categories. On many occasions classifications are
used to reduce original data on direct measurements. A
well-known example is that of "frequency distributions."
Data collected in the form of measurements may later be
grouped and presented as a frequency distribution.

An  important  advantage  of  grouping  is  that  it  results 
in  a  considerable  reduction  of  data.  On  the  other  hand, 
it  is  not  usually  possible  to  convert  grouped  or 
classified  data  back  into  the  original  form. 

A  contingency  table  is  a  form  of  presentation  of 
grouped  data.  In  the  simplest  case,  a  group  of  N 
items  may  be  classified  into  just  two  groups,  according 
to,  say,  presence  or  absence  of  a  certain  characteristic. 
For  a  fixed  (given)  characteristic  the  different  groups 
of  classification  are  called  categories.  For  example, 
a  group  of  N  individuals  may  be  classified  according 
to hair-color (characteristic), the categories being
black, brown, blonde and "other". The categories may
be qualitative as above, or may be quantitative, as
for example in the classification by weight in pounds
consisting of five categories: 40-80, 80-120, 120-160,
160-200, 200-240. When there is only one characteristic
according to which data are classified we get a one-way
table. If there are two ways of classification, say
according to Rows and Columns, the Row-classification
having r categories and the Column-classification having
c categories, the table is called a two-way table or an
r x c table. The latter notation gives the number of
categories in each classification. Carrying this
notation further, an r x c x d table will have three
characteristics of classification, the first having
r categories, the second having c and the third d.

2. Examples

Example  1:  The  following  is  a  one-way  table  with  one 
classification-characteristic  (Geographic  Area)  and 
four  categories.  It  gives  the  distribution  of  students 
by  Geographic  Area. 


East    North    West    South    Total

4201    4552     2840    5130     16723

Example 2: Consider the distribution of 20 balls in six cells.

Cell         1    2    3    4    5    6    Tot.

Occupancy    2    4    4    5    1    4    20


It  may  be  recalled  at  this  point  that  in  many 
situations  such  a  distribution  of  N  balls  in  k  cells 
is  adequately  described  by  the  multinomial  distribution. 

We  may  therefore  expect  that  the  multinomial  distri¬ 
bution  will  have  an  important  role  to  play  in  the  analysis 
of  contingency  tables. 
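
As an illustration of the role of the multinomial distribution, the following sketch computes the probability of the occupancy pattern of Example 2. The assumption that all six cells are equally likely is made here only for the sake of the illustration and is not stated in the text; the function name is likewise hypothetical.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing `counts` under a multinomial distribution
    with cell probabilities `probs` and N = sum(counts) trials."""
    n = sum(counts)
    # Multinomial coefficient N! / (x1! x2! ... xk!), computed exactly.
    coefficient = factorial(n)
    for c in counts:
        coefficient //= factorial(c)
    return coefficient * prod(p ** c for p, c in zip(probs, counts))


# Occupancy numbers of Example 2: 20 balls in six cells.
occupancy = [2, 4, 4, 5, 1, 4]
# Assumed (not from the text): each cell has probability 1/6.
equal_probs = [1 / 6] * 6
probability = multinomial_pmf(occupancy, equal_probs)
```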

Example 3: The distribution of students by Geographic
area (as in Ex. 1) and sex gives rise to the following
2 x 4 contingency table.

                 Geographic Area
Sex        East    North    West    South    Totals

Male       2201    2350     1400    3100     9051
Female     2000    2202     1440    2030     7672

Totals     4201    4552     2840    5130     16723

Note  that  this  is  called  a  2  x  4  table  since  the 
Row-classification  (sex)  has  2  categories.  If  the 
geographic  areas  were  written  in  rows  and  the  sex 
were  to  correspond  to  columns  we  would  get  a  4  x  2 
table.  We  will  follow  this  convention  throughout. 


Observe  that  for  a  two-way  table  there  are  two 
sets  of  marginal  totals.  In  the  above  table  the  totals 
on  the  right  can  be  looked  upon  as  a  one-way  table  with 
sex  as  a  characteristic  and  two  categories,  male  and 
female.  At  the  bottom  of  the  above  table,  we  see  the 
one-way table of Ex. 1. This shows that any two-way
table is associated with two one-way tables given
by the marginal totals of each characteristic.
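
The two sets of marginal totals can be obtained mechanically from the cell entries, as in the sketch below. The cell values used are those that agree with the printed row and column totals of Example 3 (male row total 9051, column totals 4552 and 5130); the variable names are illustrative only.

```python
# Cell frequencies of the 2 x 4 table of Example 3 (sex by geographic area).
areas = ["East", "North", "West", "South"]
table = {
    "Male":   [2201, 2350, 1400, 3100],
    "Female": [2000, 2202, 1440, 2030],
}

# Row marginals: the one-way table for sex.
row_totals = {sex: sum(counts) for sex, counts in table.items()}

# Column marginals: the one-way table for geographic area (Example 1).
column_totals = {
    area: sum(table[sex][j] for sex in table)
    for j, area in enumerate(areas)
}

grand_total = sum(row_totals.values())
```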

Example 4: The data below are octane determinations
on  independent  samples  of  gasoline  obtained  in  two 
regions  of  the  northeastern  United  States  in  the  summer 
of  1953.  (Brownlee,  Statistical  Theory  and  Methodology, 
J.  Wiley,  1965,  p.  306). 


Region A:  84.0  83.5  84.0  85.0  83.1  83.5  81.7  85.4
           84.1  83.0  85.8  84.0  84.2  82.2  83.6  84.9

Region D:  80.2  82.9  84.6  84.2  82.8  83.0  82.9  83.4
           83.1  83.5  83.6  86.7  82.6  82.4  83.4  82.7
           82.9  83.7  81.5  81.9  81.7  82.5

The  problem  of  interest  was  whether  the  variability 
in  the  octane  numbers  could  be  regarded  as  the  same 
for the two regions. Since the numbers of sample
values for regions A and D are small (16 and 22 respectively)
the data can be conveniently analyzed in the
given form. For the sake of illustration, suppose that
we classify the octane readings into three categories:
below 83.5 as "poor", between 83.5 and 84.5 as "normal"
and above 84.5 as "better"; we will get the following
2 x 3 table:

                Gasoline quality
Region     Poor    Normal    Better    Totals

A          4       8         4         16
D          16      5         1         22

Totals     20      13        5         38
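
The grouping rule can be sketched in a few lines of Python, shown here for the Region A readings only. The treatment of readings falling exactly on 83.5 or 84.5 is an assumption (both are counted as "normal"); the function and variable names are illustrative and not part of the original report.

```python
def classify(reading):
    """Assign an octane reading to one of the three quality categories.

    Assumption: boundary values 83.5 and 84.5 are counted as "Normal".
    """
    if reading < 83.5:
        return "Poor"
    if reading <= 84.5:
        return "Normal"
    return "Better"


# Octane determinations for Region A (16 readings).
region_a = [84.0, 83.5, 84.0, 85.0, 83.1, 83.5, 81.7, 85.4,
            84.1, 83.0, 85.8, 84.0, 84.2, 82.2, 83.6, 84.9]

# Tally the readings into a one-way frequency table.
counts_a = {"Poor": 0, "Normal": 0, "Better": 0}
for reading in region_a:
    counts_a[classify(reading)] += 1
```

With this boundary convention the tally for Region A agrees with the first row of the table.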

This  illustrates  how  to  prepare  contingency  tables 
from  actual  measurement-data.  But  the  example  brings 
out another important point. The contingency table,
in fact, represents two frequency distributions, one
for Region A and the other for Region D, laid side
by side. This table is different from the ones we
came across earlier in that we did not start the classification
with a total of 38 values, to be classified
according to Region and Quality; rather we had a priori
a set of 16 values for Region A and 22 values for
Region D. (Further, the sampling for the two regions was
done independently.) In other words, the set of marginal
totals (or the one-way table) for Region was
fixed before the experiments. Later on we will have
ample opportunities to see the effect of such restrictions
on the analyses. At present, it is enough to
know that tables as above may be regarded as contingency
tables with fixed (restricted) marginal totals.

3. Problems associated with contingency tables

In the analysis of contingency tables we are
usually interested in the relationship between one
classification and one or more of the other classifications.
Thus in Example 4 on comparison of
octane ratings we would like to compare the variability
of the values for classifications given by Regions
A and D. As another example, consider a three-way
r x c x d contingency table in which the row-classification
represents the response of an experiment
on animals, the column classification types of treatment
and the depth classification sex. The following
hypotheses may be of interest.

1. Response is independent of treatment irrespective
of sex.

2. Response is independent of the different
combinations of treatment and sex (as against the
possibility that a particular treatment is more
"effective" in terms of the response, for a particular
sex).

3. Given sex, response is independent of treatment.

We shall see in subsequent chapters how these
hypotheses can be formulated mathematically. Of course,
not all contingency tables can be interpreted in such
a straightforward manner. In some instances, all
three classifications can be considered as responses;
then we may be interested in the independence or
association among these responses. In other cases,
a classification may be controlled, experimentally
or naturally, like three specified levels of fertilizer
applied or sex, in which case the classification is
termed a factor. For convenience, we shall group
all the concepts of association, dependence, etc.
under the general term of interaction. No interaction
between treatment and sex appears to be a more acceptable
phrase than independence between treatment and sex,
since the term independence is usually reserved to
express the relationship between random variables.
We may also say that the interaction between response
and treatment does not interact with sex, meaning the
degree of association between response and treatment
is the same for both sexes. This concept gives
rise to the idea of second-order interaction. There
are a number of different approaches to the mathematical
formulation and interpretation of the concept of "no
interaction". One such approach, through the concept
of "generalized independence", is powerful and general
enough to include all hypotheses of "no interaction"
(formulated in a specific manner) and many other
hypotheses about homogeneity, symmetry, etc. that
we come across in analyzing contingency tables.
Before this concept is introduced, we shall need the
necessary symbolism and notation.

4.  Notation  and  preliminaries: 

We have seen that the entries in the "cells" of a
contingency table are frequencies of occurrence. We will
denote these frequencies generically by the letter x,
with or without subscripts. These frequencies are a
result of classification of a fixed number of individuals
according to a certain probability distribution. Hence
the observed frequencies x can be looked upon as realizations
of a random variable X.

The cell of a contingency table and the observed
frequency in that cell are symbolically associated in
the following manner. In Example 1, we have a one-way
table representing the distribution of 16723 students
by geographic area. We denote the frequencies in the
table by x(i) with the notation


Characteristic     Index    1       2        3       4

Geographic area    i        East    North    West    South

Thus x(3), for example, equals 2840. The total 16723 of
all x(i) for i = 1, 2, 3 and 4 will be denoted by x(.).
That is, $\sum_{i=1}^{4} x(i) = x(.) = 16723$. For the two-way
table of Ex. 3, we denote the frequencies in the table
by x(ij) with the notation


Characteristic     Index    1       2         3       4

Sex                i        Male    Female
Geographic area    j        East    North     West    South

Then x(2,3) = 1440, x(1,4) = 3100 and so on. To denote
marginal totals we will use the dot notation as before.

The row marginals are

$\sum_{j=1}^{4} x(1j) = x(1.) = 9051$, $\sum_{j=1}^{4} x(2j) = x(2.) = 7672$.

The column marginals are

$\sum_{i=1}^{2} x(i1) = x(.1) = 4201$, ..., $\sum_{i=1}^{2} x(i4) = x(.4) = 5130$.

The grand total is denoted by x(..) so that
x(1.) + x(2.) = x(..) = x(.1) + x(.2) + x(.3) + x(.4) = 16723 = N.

Now consider the following three-way table:

Propagation of plum root stocks from root-cuttings

Response            At once            Spring
(Mortality)      Long    Short      Long    Short     Totals

Alive            156     107        84      31        378
Dead             84      133        156     209       582

Totals           240     240        240     240       960

The frequencies in the cells are denoted by x(ijk)
with the notation

Characteristic       Index    1          2

Mortality            i        Alive      Dead
Time of planting     j        At once    Spring
Length of cutting    k        Long       Short

The marginals are as follows:

One-way marginals:
$\sum_j \sum_k x(ijk) = x(i..)$, i = 1, 2
$\sum_i \sum_k x(ijk) = x(.j.)$, j = 1, 2
$\sum_i \sum_j x(ijk) = x(..k)$, k = 1, 2

Two-way marginals:
$\sum_i x(ijk) = x(.jk)$, j = 1, 2, k = 1, 2
$\sum_j x(ijk) = x(i.k)$, i = 1, 2, k = 1, 2
$\sum_k x(ijk) = x(ij.)$, i = 1, 2, j = 1, 2

Note that $\sum_i x(ij.) = x(.j.)$, $\sum_j x(ij.) = x(i..)$,
$\sum_i x(i..) = x(...)$ etc.

For the above table, x(1..) = 378, x(2..) = 582 and
x(...) = 960. It should be observed that x(.jk) = 240
for all the four combinations of j and k. This restriction
is imposed by the method of experimentation; for each
combination of the planting time and cutting length
exactly 240 root-stocks were used and their mortality
observed. This is another case of fixed marginals,
similar to the one encountered in Ex. 4.
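
The dot notation translates directly into sums over the corresponding indices. The following sketch computes several of the marginals for the plum root-stock table; the nested-list layout x[i][j][k] (with indices starting at 0 rather than 1) and the variable names are choices made for this illustration.

```python
# x[i][j][k]: i = mortality (alive, dead), j = time of planting
# (at once, spring), k = length of cutting (long, short).
x = [
    [[156, 107], [84, 31]],    # alive
    [[84, 133], [156, 209]],   # dead
]
I, J, K = 2, 2, 2

# One-way marginals x(i..), x(.j.), x(..k).
x_i = [sum(x[i][j][k] for j in range(J) for k in range(K)) for i in range(I)]
x_j = [sum(x[i][j][k] for i in range(I) for k in range(K)) for j in range(J)]
x_k = [sum(x[i][j][k] for i in range(I) for j in range(J)) for k in range(K)]

# Two-way marginal x(.jk), fixed at 240 by the design of the experiment.
x_jk = [[sum(x[i][j][k] for i in range(I)) for k in range(K)] for j in range(J)]

# Grand total x(...).
x_total = sum(x_i)
```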

The notation for cell frequencies and for marginal
totals can be extended in an obvious manner to four-way,
five-way and higher order tables.

Let us now recall that in a contingency table a
number of individuals are classified into cells. In
other words, for a given cell, an individual is classified
in the cell with a certain probability. In a four-way
table, for example, each cell will be denoted by (i,j,k,ℓ)
for some values of the indices i, j, k and ℓ. The
probability that an individual will be classified in this
cell will be denoted by p(ijkℓ). Just as we defined the
marginal totals for the cell frequencies x(ijkℓ) we may
define marginal totals for probabilities. For example,

$p(i...) = \sum_j \sum_k \sum_\ell p(ijk\ell)$

$p(.j.\ell) = \sum_i \sum_k p(ijk\ell)$

etc.

For a two-way table the cell probabilities will be
denoted by p(ij), for a three-way table by p(ijk) and so
on. But we would like to develop the theory of all
contingency tables in a unified manner. For this purpose
it is necessary to use a symbol, ω, say, which will
generically denote cells like (ij) in a two-way table,
(ijkℓ) in a four-way table and so on. For example,
in a 2x3x5 table, the symbol x(ω) will replace x(ijk),
ω being one of the 2x3x5 = 30 cells. The symbol ω here
corresponds to the triplet (ijk) and takes "values"
(1,1,1), (1,1,2), ..., (1,1,5), (1,2,1), ..., (2,3,5).
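
The generic cell symbol has a natural counterpart in code: a single loop variable running over all index combinations. The sketch below enumerates the cells of a 2x3x5 table in the order listed above; the dictionary of zero counts is only a placeholder, not data from the text.

```python
from itertools import product

r, c, d = 2, 3, 5

# Each omega is a triplet (i, j, k) with 1-based category indices;
# the last index varies fastest, as in the listing in the text.
cells = list(product(range(1, r + 1), range(1, c + 1), range(1, d + 1)))

# A table in the generic notation: x(omega) stored as a mapping from
# cell to count. The zero counts are placeholders only.
x = {omega: 0 for omega in cells}

# A sum over all cells is written once, whatever the dimension of the table.
total = sum(x[omega] for omega in cells)
```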

Let us now go back to some problems associated with
the analysis of contingency tables discussed in 3,
and see how we can formulate them symbolically, with the
help of the notation developed. We considered an r x c x 2
table in which the row-classification represents response
in an experiment on animals, the column classification
represents types of treatment and the depth classification
represents sex. The cell probabilities are p(ijk).

1.  Response  is  independent  of  treatment  irrespective 
of  sex. 

Since the sex of the animal is immaterial in the
statement of the hypothesis, we consider marginal totals
of probabilities of the form p(ij.). Now, since the
response is postulated to be independent of treatment
we further have

p(ij.) = p(i..) p(.j.), i = 1, ..., r, j = 1, ..., c.

2. Response is independent of the different combinations
of treatment and sex.

The probability corresponding to a particular combination
of treatment and sex is given by the (marginal)
total p(.jk). The hypothesis is formulated, therefore,
as

p(ijk) = p(i..) p(.jk), i = 1, ..., r, j = 1, ..., c, k = 1, 2.

3. Given sex, response is independent of treatment.
Let the conditional probability of being classified in
the cell (ijk), given that the individual is classified
in the k-th depth classification (sex), be denoted by
p(ij|k). Also, the marginal conditional probability
of classification in the i-th category irrespective
of the column classification is p(i.k)/p(..k) and a similar
marginal probability for the j-th category of the column
classification, given k, is p(.jk)/p(..k). The hypothesis
then states that

$p(ij|k) = \frac{p(i.k)\, p(.jk)}{p^2(..k)}$, k = 1, 2, i = 1, ..., r, j = 1, ..., c.


But  p (i j | k)  =  p(ijk)/p(.  .k) ,  so  that  the  above  relations 
can  be  restated  as 


p(ijk) 


pd.lQpl.3^^ 

pl.’.k) 


1 • • • r ;  • • •  c  • 


Observe that Σ_i Σ_j p(ij|k) = 1, since given that an
individual fell into the k-th category, it must be classified
in one of the (i,j) cells corresponding to the fixed k.
This imposes the restriction that

Σ_i Σ_j p(ij|k) = 1 = Σ_i Σ_j p(ijk)/p(..k),  k = 1,2,

i.e.

Σ_i Σ_j p(ijk) = p(..k),  k=1,2.

Note that the second hypothesis (of independence)
led us to the formulation p(ijk) = p(i..) p(.jk) and
the third hypothesis (of conditional independence) led
to p(ijk) = p(i.k)p(.jk)/p(..k). The cell probabilities in
each case are expressed as products of marginal probabilities.
From another point of view, we can say that the
trivariate function p(ijk) is expressed as a product
of (simpler) univariate and bivariate functions, of the
form p(.jk) and p(i..), for example. When the cell
probabilities are thus expressible as products of functions
of a smaller subset of arguments, we say that the probabilities
obey generalized independence. By generalized
independence is meant that the cell probability of a
multidimensional contingency table may be expressed as the product
of factors which are functions of various marginals (Ireland
and Kullback, 1968; Ku and Kullback, 1968; Ku et al., 1971).
The common notions of independence, conditional independence,
homogeneity, or conditional homogeneity in contingency tables
are all special cases of generalized independence. This is a
consequence of the fact that in accordance with the minimum
discrimination information theorem, the m.d.i. estimates are
formulated as members of an exponential family, which may
also be expressed as a multiplicative model or a logarithmic
linear additive model (Kullback, 1959; Ireland and Kullback,
1968; Ku et al., 1971). Note that we do not assume such a
model to start with, as others have (Birch, 1963; Bishop, 1967,
1969; Goodman, 1970; Mantel, 1966), but derive this model by
the principle of minimum discrimination information estimation.

5. Estimates

We shall denote estimates of the cell entries under
various hypotheses or models by x*_a(ω), where values of the
subscript a will range over the hypotheses or models.

For  two-way  2x2  tables  the  primary  question  of  interest 
is  whether  the  row  and  column  variables  are  independent.  An 
example  of  such  a  table  is  shown  in  Table  1. 


Table 1.
x(ij)

           j = 1    j = 2
i = 1      x(11)    x(12)    x(1.)
i = 2      x(21)    x(22)    x(2.)
           x(.1)    x(.2)    x(..) = n

To answer this question one estimates the cell entries
under the hypothesis of independence as a product of the
marginals, that is, denoting the estimate by x*(ij) one uses
x*(ij) = x(i.)x(.j)/n. Some appropriate measure of the
deviation between x(ij) and x*(ij) is then used to determine
whether the differences are "larger" than one would reasonably
expect under the hypothesis of independence.

The  estimated  two-way  table  under  the  hypothesis  or  model 
of  independence  is  given  in  Table  2. 


Table 2.

ESTIMATE UNDER INDEPENDENCE
x*(ij)

           j = 1           j = 2
i = 1      x(1.)x(.1)/n    x(1.)x(.2)/n    x(1.)
i = 2      x(2.)x(.1)/n    x(2.)x(.2)/n    x(2.)
           x(.1)           x(.2)           n

Note that the estimated table has the same marginals as the
observed table x(ij).

A common statistical measure of the association, or interaction,
between the variables of a two-way 2x2 contingency table is the
cross-product ratio, or its logarithm. The cross-product ratio is
defined by

x(11)x(22) / [x(12)x(21)],

though we shall be more concerned with its logarithm

ln { x(11)x(22) / [x(12)x(21)] }.

We shall use natural logarithms, that is, logarithms to the base e,
rather than common logarithms to the base 10, because of the nature of the
underlying mathematical statistical theory. Note that with the estimate
for independence, or no association, the logarithm of the cross-product
ratio is zero.

ln { x*(11)x*(22) / [x*(12)x*(21)] }
    = ln { [x(1.)x(.1)/n][x(2.)x(.2)/n] / ([x(1.)x(.2)/n][x(2.)x(.1)/n]) }
    = ln 1 = 0.
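As a small numerical illustration of the statement above, the following sketch computes the independence estimate and the logarithm of the cross-product ratio for a 2x2 table. The counts and the function names are hypothetical and chosen only for illustration; the sketch assumes all four cell entries are positive.

```python
import math

def independence_estimate(x):
    """Return x*(ij) = x(i.)x(.j)/n for a 2x2 table given as nested lists."""
    row = [sum(r) for r in x]          # x(1.), x(2.)
    col = [sum(c) for c in zip(*x)]    # x(.1), x(.2)
    n = sum(row)
    return [[row[i] * col[j] / n for j in range(2)] for i in range(2)]

def log_cross_product_ratio(x):
    """ln[x(11)x(22) / (x(12)x(21))]; assumes all four entries are positive."""
    return math.log(x[0][0] * x[1][1] / (x[0][1] * x[1][0]))

observed = [[30, 10], [20, 40]]        # hypothetical counts, not from the report
estimate = independence_estimate(observed)

log_ratio_observed = log_cross_product_ratio(observed)
log_ratio_estimate = log_cross_product_ratio(estimate)
```

For these counts the observed table has a positive log cross-product ratio, while the estimate under independence gives zero, as the algebra requires.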


The logarithm of the cross-product ratio is positive if the odds satisfy
the inequalities

x(11)/x(21) > x(12)/x(22)  or  x(11)/x(12) > x(21)/x(22),

since then we get for the log-odds

ln { x(11)x(22) / [x(12)x(21)] } = ln [x(11)/x(21)] - ln [x(12)/x(22)]
                                 = ln [x(11)/x(12)] - ln [x(21)/x(22)] > 0.

The logarithm of the cross-product ratio is negative if the odds satisfy
the inequalities

x(11)/x(21) < x(12)/x(22)  or  x(11)/x(12) < x(21)/x(22),

since then we get for the log-odds

ln { x(11)x(22) / [x(12)x(21)] } = ln [x(11)/x(21)] - ln [x(12)/x(22)]
                                 = ln [x(11)/x(12)] - ln [x(21)/x(22)] < 0.

The logarithm of the cross-product ratio thus varies from -∞ to +∞.
Later we shall consider procedures for assessing the significance of the
deviation of the logarithm of the cross-product ratio from zero, the value
corresponding to no association or interaction.

Similar procedures apply to the case of a two-way rxc contingency
table, that is, one with r rows and c columns.

TABLE 3a

TWO-WAY rxc CONTINGENCY TABLE

x(11)    x(12)    ...    x(1c)    x(1.)
x(21)    x(22)    ...    x(2c)    x(2.)
 ...      ...     ...     ...      ...
x(r1)    x(r2)    ...    x(rc)    x(r.)
x(.1)    x(.2)    ...    x(.c)    n

Under a hypothesis or model of independence of row and column categories
x*(ij) = x(i.)x(.j)/n. If the row categories, say, are not randomly
observed but selected with respect to some characteristic, say time or
space, the mathematical procedure is the same for determining
whether the column categories are homogeneous over the row categories,
time or space, for instance. In the latter case we may consider the
two-way table as a set of one-way tables. Terms which cover both the case of
independence and homogeneity are "association" or "interaction," that is,
we question whether there is association or interaction among the variables.

The estimated two-way rxc contingency table under the hypothesis
or model of independence is given in Table 3b.

TABLE 3b

ESTIMATE UNDER INDEPENDENCE
x*(ij)

i\j    1               2               ...    c
1      x(1.)x(.1)/n    x(1.)x(.2)/n    ...    x(1.)x(.c)/n    x(1.)
2      x(2.)x(.1)/n    x(2.)x(.2)/n    ...    x(2.)x(.c)/n    x(2.)
...    ...             ...             ...    ...             ...
r      x(r.)x(.1)/n    x(r.)x(.2)/n    ...    x(r.)x(.c)/n    x(r.)
       x(.1)           x(.2)           ...    x(.c)           n

Note that the estimated table has the same marginals as the observed
Table 3a.
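The marginal-preserving property of the rxc estimate can be checked numerically. The sketch below is only an illustration: the 2x3 counts and the helper names are hypothetical, not taken from the report.

```python
def independence_estimate(x):
    """Return x*(ij) = x(i.)x(.j)/n for an r x c table given as nested lists."""
    row = [sum(r) for r in x]
    col = [sum(c) for c in zip(*x)]
    n = sum(row)
    return [[ri * cj / n for cj in col] for ri in row]

def marginals(x):
    """Return (row totals, column totals) of a two-way table."""
    return [sum(r) for r in x], [sum(c) for c in zip(*x)]

observed = [[12, 5, 3], [8, 15, 7]]    # hypothetical 2x3 counts
estimate = independence_estimate(observed)

obs_rows, obs_cols = marginals(observed)
est_rows, est_cols = marginals(estimate)
```

Up to floating-point rounding, est_rows and est_cols agree with obs_rows and obs_cols, although the individual cell entries of the two tables differ.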


A  three-way  contingency  table  arises  when  each  observation  has  three 
classifications  with  different  possible  numbers  of  categories  for  each 
classification. The simplest three-way contingency table is 2x2x2, that
is, with two categories for each classification.
In the general notation we have Table 4.

TABLE 4

                i = 1                 i = 2
           j = 1     j = 2       j = 1     j = 2
k = 1      x(111)    x(121)      x(211)    x(221)
k = 2      x(112)    x(122)      x(212)    x(222)
           x(11.)    x(12.)      x(21.)    x(22.)

The two-way marginals are

x(11.) = x(111) + x(112),
x(12.) = x(121) + x(122),
x(21.) = x(211) + x(212),
x(22.) = x(221) + x(222),
x(1.1) = x(111) + x(121),
x(1.2) = x(112) + x(122),
x(2.1) = x(211) + x(221),
x(2.2) = x(212) + x(222),
x(.11) = x(111) + x(211),
x(.12) = x(112) + x(212),
x(.21) = x(121) + x(221),
x(.22) = x(122) + x(222).

The one-way marginals are

x(1..) = x(111) + x(112) + x(121) + x(122) = x(11.) + x(12.),
x(2..) = x(211) + x(212) + x(221) + x(222) = x(21.) + x(22.),
x(.1.) = x(111) + x(112) + x(211) + x(212) = x(11.) + x(21.),
x(.2.) = x(121) + x(122) + x(221) + x(222) = x(12.) + x(22.),
x(..1) = x(111) + x(121) + x(211) + x(221) = x(1.1) + x(2.1),
x(..2) = x(112) + x(122) + x(212) + x(222) = x(1.2) + x(2.2).

The entries x(ijk) in Table 4 may also be considered as three-way
marginals.

With more variables there are more possible questions of interest.
One may be interested in whether any pair of the variables are independent
or show no interaction or association. One may be interested in
conditional independence, that is, whether a pair of variables are independent
given the third variable. One may be interested in whether the three
variables are mutually independent or whether one of the variables is
independent of the pair of the other variables. These questions of
independence, no interaction or association are all answered by considering
estimates which are explicitly represented in terms of products of
various marginals. We list some of these estimates.

Mutual independence of i, j, and k            x*_a(ijk) = x(i..)x(.j.)x(..k)/n²,

Independence of i and (jk) jointly            x*_b(ijk) = x(i..)x(.jk)/n,

Conditional independence of i and j given k   x*_c(ijk) = x(i.k)x(.jk)/x(..k).

As might be expected, these estimates also apply in the general three-way
rxsxt contingency table.

We note that the estimate under mutual independence of i, j,
and k has the same one-way marginals as the observed table x(ijk),

x*_a(111) = x(1..)x(.1.)x(..1)/n²,
x*_a(112) = x(1..)x(.1.)x(..2)/n²,
x*_a(121) = x(1..)x(.2.)x(..1)/n²,
x*_a(122) = x(1..)x(.2.)x(..2)/n²,
x*_a(211) = x(2..)x(.1.)x(..1)/n²,
x*_a(212) = x(2..)x(.1.)x(..2)/n²,
x*_a(221) = x(2..)x(.2.)x(..1)/n²,
x*_a(222) = x(2..)x(.2.)x(..2)/n²,

x*_a(1..) = x*_a(111) + x*_a(112) + x*_a(121) + x*_a(122)
          = x(1..)x(.1.)/n + x(1..)x(.2.)/n
          = x(1..),

x*_a(2..) = x*_a(211) + x*_a(212) + x*_a(221) + x*_a(222)
          = x(2..)x(.1.)/n + x(2..)x(.2.)/n
          = x(2..),

x*_a(.1.) = x*_a(111) + x*_a(112) + x*_a(211) + x*_a(212)
          = x(1..)x(.1.)/n + x(2..)x(.1.)/n
          = x(.1.),

x*_a(.2.) = x*_a(121) + x*_a(122) + x*_a(221) + x*_a(222)
          = x(.2.),

x*_a(..1) = x*_a(111) + x*_a(121) + x*_a(211) + x*_a(221)
          = x(..1),

x*_a(..2) = x*_a(112) + x*_a(122) + x*_a(212) + x*_a(222)
          = x(..2).

However, the two-way marginals of the estimate under mutual independence
of i, j, and k differ from the two-way marginals of the observed
table x(ijk). Thus, for example,

x*_a(11.) = x*_a(111) + x*_a(112)
          = x(1..)x(.1.)x(..1)/n² + x(1..)x(.1.)x(..2)/n²
          = x(1..)x(.1.)/n,

and the latter value is not necessarily equal to x(11.).
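The behaviour of the marginals under mutual independence can be verified on a small example. The sketch below stores a 2x2x2 table as a dictionary keyed by (i, j, k); the counts and helper names are hypothetical and serve only to illustrate the algebra above.

```python
from itertools import product

def marginal(table, keep):
    """Sum a table {(i, j, k): value} over the axes not listed in keep."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in keep)
        out[key] = out.get(key, 0.0) + value
    return out

def mutual_independence(x):
    """x*(ijk) = x(i..)x(.j.)x(..k)/n^2."""
    n = sum(x.values())
    xi, xj, xk = marginal(x, (0,)), marginal(x, (1,)), marginal(x, (2,))
    return {(i, j, k): xi[(i,)] * xj[(j,)] * xk[(k,)] / n**2 for (i, j, k) in x}

cells = list(product((1, 2), repeat=3))   # (1,1,1), (1,1,2), ..., (2,2,2)
observed = dict(zip(cells, [10, 20, 30, 40, 25, 15, 35, 25]))  # hypothetical counts
estimate = mutual_independence(observed)

one_way_obs = [marginal(observed, (a,)) for a in range(3)]
one_way_est = [marginal(estimate, (a,)) for a in range(3)]
x11_obs = marginal(observed, (0, 1))[(1, 1)]   # x(11.)
x11_est = marginal(estimate, (0, 1))[(1, 1)]   # x(1..)x(.1.)/n
```

With these counts every one-way marginal of the estimate matches the observed table, while x(11.) is 30 in the observed table and 35 in the estimate.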

The estimate under the hypothesis or model of independence of i
and (jk) jointly has the same one-way marginals and the same two-way
jk-marginal as the observed table x(ijk),

x*_b(111) = x(1..)x(.11)/n,
x*_b(112) = x(1..)x(.12)/n,
x*_b(121) = x(1..)x(.21)/n,
x*_b(122) = x(1..)x(.22)/n,
x*_b(211) = x(2..)x(.11)/n,
x*_b(212) = x(2..)x(.12)/n,
x*_b(221) = x(2..)x(.21)/n,
x*_b(222) = x(2..)x(.22)/n,

x*_b(1..) = x*_b(111) + x*_b(112) + x*_b(121) + x*_b(122)
          = x(1..)x(.11)/n + x(1..)x(.12)/n + x(1..)x(.21)/n + x(1..)x(.22)/n
          = x(1..)[x(.11) + x(.12) + x(.21) + x(.22)]/n
          = x(1..).

Similar results follow for the other one-way marginals.

x*_b(.11) = x*_b(111) + x*_b(211)
          = x(1..)x(.11)/n + x(2..)x(.11)/n
          = x(.11),

x*_b(.12) = x*_b(112) + x*_b(212)
          = x(1..)x(.12)/n + x(2..)x(.12)/n
          = x(.12),

x*_b(.21) = x*_b(121) + x*_b(221)
          = x(1..)x(.21)/n + x(2..)x(.21)/n
          = x(.21),

x*_b(.22) = x*_b(122) + x*_b(222)
          = x(1..)x(.22)/n + x(2..)x(.22)/n
          = x(.22).

However, for the other two-way marginals, for example,

x*_b(11.) = x*_b(111) + x*_b(112)
          = x(1..)x(.11)/n + x(1..)x(.12)/n
          = x(1..)[x(.11) + x(.12)]/n
          = x(1..)x(.1.)/n,

and the latter value is not necessarily equal to x(11.).

x*_b(1.1) = x*_b(111) + x*_b(121)
          = x(1..)x(.11)/n + x(1..)x(.21)/n
          = x(1..)[x(.11) + x(.21)]/n
          = x(1..)x(..1)/n,

and the latter value is not necessarily equal to x(1.1).

The estimate under the hypothesis or model of conditional independence
of i and j given k has the same one-way marginals and the
same two-way ik- and jk-marginals as the observed table x(ijk),

x*_c(111) = x(1.1)x(.11)/x(..1),
x*_c(112) = x(1.2)x(.12)/x(..2),
x*_c(121) = x(1.1)x(.21)/x(..1),
x*_c(122) = x(1.2)x(.22)/x(..2),
x*_c(211) = x(2.1)x(.11)/x(..1),
x*_c(212) = x(2.2)x(.12)/x(..2),
x*_c(221) = x(2.1)x(.21)/x(..1),
x*_c(222) = x(2.2)x(.22)/x(..2),

x*_c(1..) = x*_c(111) + x*_c(112) + x*_c(121) + x*_c(122)
          = x(1.1)x(.11)/x(..1) + x(1.2)x(.12)/x(..2)
            + x(1.1)x(.21)/x(..1) + x(1.2)x(.22)/x(..2)
          = x(1.1) + x(1.2) = x(1..).

Similar results follow for the other one-way marginals.

x*_c(1.1) = x*_c(111) + x*_c(121)
          = x(1.1)x(.11)/x(..1) + x(1.1)x(.21)/x(..1)
          = x(1.1),

x*_c(1.2) = x*_c(112) + x*_c(122)
          = x(1.2)x(.12)/x(..2) + x(1.2)x(.22)/x(..2)
          = x(1.2),

and in a similar manner we have

x*_c(2.1) = x(2.1),  x*_c(2.2) = x(2.2),

x*_c(.11) = x*_c(111) + x*_c(211)
          = x(1.1)x(.11)/x(..1) + x(2.1)x(.11)/x(..1)
          = x(.11),

x*_c(.12) = x*_c(112) + x*_c(212)
          = x(1.2)x(.12)/x(..2) + x(2.2)x(.12)/x(..2)
          = x(.12),

and in a similar manner we have

x*_c(.21) = x(.21),  x*_c(.22) = x(.22).

However, for the other two-way marginals

x*_c(11.) = x*_c(111) + x*_c(112)
          = x(1.1)x(.11)/x(..1) + x(1.2)x(.12)/x(..2),

and the latter value is not necessarily equal to x(11.).

We remark that one of the constraints in the determination of the
estimates was that they have certain marginals the same as the observed
table.
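The estimate under conditional independence of i and j given k can be checked in the same way. The sketch below is an illustration only; the counts and the helper names are hypothetical.

```python
from itertools import product

def marginal(table, keep):
    """Sum a table {(i, j, k): value} over the axes not listed in keep."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in keep)
        out[key] = out.get(key, 0.0) + value
    return out

def conditional_independence(x):
    """x*(ijk) = x(i.k)x(.jk)/x(..k); assumes every x(..k) is positive."""
    xik, xjk, xk = marginal(x, (0, 2)), marginal(x, (1, 2)), marginal(x, (2,))
    return {(i, j, k): xik[(i, k)] * xjk[(j, k)] / xk[(k,)] for (i, j, k) in x}

cells = list(product((1, 2), repeat=3))
observed = dict(zip(cells, [10, 20, 30, 40, 25, 15, 35, 25]))  # hypothetical counts
estimate = conditional_independence(observed)

ik_obs, ik_est = marginal(observed, (0, 2)), marginal(estimate, (0, 2))
jk_obs, jk_est = marginal(observed, (1, 2)), marginal(estimate, (1, 2))
x11_obs = marginal(observed, (0, 1))[(1, 1)]   # x(11.)
x11_est = marginal(estimate, (0, 1))[(1, 1)]   # generally different from x(11.)
```

For these counts the ik- and jk-marginals are reproduced, whereas the ij-marginal x(11.) is 30 in the observed table and 35 in the estimate.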

For the three-way 2x2x2 contingency table in addition to the
classic types of independence, interaction or association, there
arises an additional one, important historically and practically.
This is known as no three-factor or no second-order interaction.
No three-factor or no second-order
interaction implies that the logarithm of the association measured by the
cross-product ratio for any two of the variables is the same for all the
values of the third variable, that is, there is no second-order interaction
if

     ln { x(111)x(221) / [x(121)x(211)] } = ln { x(112)x(222) / [x(122)x(212)] },   i, j given k,

(1)  ln { x(111)x(212) / [x(112)x(211)] } = ln { x(121)x(222) / [x(122)x(221)] },   i, k given j,

     ln { x(111)x(122) / [x(112)x(121)] } = ln { x(211)x(222) / [x(212)x(221)] },   j, k given i.

One is concerned with the possible hypothesis or model of no
second-order interaction when none of the other types of independence are
found. However, in this case, the corresponding estimate cannot be
expressed explicitly in terms of observed marginals although the estimate
is constrained to have the same two-way marginals as the observed table.
Straightforward iterative procedures exist to determine the estimate
under the hypothesis or model of no second-order interaction. For the
general three-way contingency table there are of course many more
relations among the log cross-product ratios like (1) which must be
satisfied, but the iterative procedures to determine the estimate extend
to the general case with no difficulty.

We may be concerned with a set of two-way tables for which it is
of interest to determine whether they are homogeneous with respect to a
third factor, say space or time. Such problems may also be treated as
three-way contingency tables using the space or time factor as the third
classification (Kullback, 1959).

For four-way and higher order contingency tables the problem of
presentation of the data increases, as do the variety and number of
questions about relationships of possible interest and varieties of
interaction. The basic ideas, concepts, notation and terminology we have
discussed for the two- and three-way contingency tables extend to the more
general cases as we consider the methodology (Ku et al., 1971).

3. Log-linear Representation


1. Minimum Discrimination Information Estimation

To make the presentation more specific, and with no essential
restriction on the generality, we discuss it in terms of the analysis
of four-way contingency tables. Let us consider the collection of
four-way contingency tables RxSxTxU of dimension rxsxtxu. For
convenience let us denote the aggregate of all cell identifications, as
well as their number, by Ω with individual cells identified by ω, so that
the generic variable is ω = (i,j,k,l), i=1,...,r, j=1,...,s, k=1,...,t,
l=1,...,u. In this case we also identify Ω as rstu. Suppose there are
two probability distributions or contingency tables (we shall use these
terms interchangeably) defined over the aggregate or space Ω, say p(ω),
π(ω), Σ_Ω p(ω) = 1, Σ_Ω π(ω) = 1. The discrimination information is defined
by

I(p:π) = Σ_Ω p(ω) ln [p(ω)/π(ω)].

For the various applications we shall consider the π-distribution,
π(ω), according to the problem of interest, may either be specified, may
be an estimated distribution, or may be an observed distribution. The
p-distribution, p(ω), ranges over or is a member of a family P of
distributions of interest satisfying certain restraints.

Of the various properties of I(p:π) we mention in particular the
fact that I(p:π) ≥ 0 and = 0 if, and only if, p(ω) = π(ω) (Kullback, 1959).

Many problems in the analysis of contingency tables may be
characterized as estimating a distribution or contingency table subject
to certain restraints and then comparing the estimated table with an
observed table to determine whether the observed table satisfies a null
hypothesis or model implied by the restraints. In accordance with the
principle of minimum discrimination information estimation, we determine
that member of the collection or family P of distributions, which minimizes
the discrimination information I(p:π). We denote the minimum discrimination
information estimate by p*(ω) so that

I(p*:π) = Σ p*(ω) ln [p*(ω)/π(ω)] = min I(p:π),  p ∈ P,  p* ∈ P.

Unless otherwise stated, the summation is over Ω which will be omitted.

It may be shown that if p(ω) is any member of the family P of
distributions, then

(1)  I(p:π) = I(p*:π) + I(p:p*).

The Pythagorean type property (1) plays an important role in the analysis
of information tables.
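The Pythagorean type property (1) can be illustrated on a 2x2 table. The sketch below assumes, for illustration only, that π is uniform, that the family P consists of the distributions having the observed row and column marginals, and that p is the observed distribution itself; under these assumptions p* is the product of the marginals. The counts and names are hypothetical.

```python
import math

def discrimination_information(p, q):
    """I(p:q) = sum of p ln(p/q), with the convention 0 ln 0 = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# 2x2 table in lexicographic order: cells (11), (12), (21), (22)
observed = [30, 10, 20, 40]            # hypothetical counts
n = sum(observed)
p = [v / n for v in observed]          # a member of the family P
pi_uniform = [0.25] * 4                # the pi-distribution

rows = (p[0] + p[1], p[2] + p[3])
cols = (p[0] + p[2], p[1] + p[3])
# m.d.i. estimate relative to the uniform distribution with both marginals fixed
p_star = [rows[0] * cols[0], rows[0] * cols[1], rows[1] * cols[0], rows[1] * cols[1]]

lhs = discrimination_information(p, pi_uniform)
rhs = (discrimination_information(p_star, pi_uniform)
       + discrimination_information(p, p_star))
```

The two sides agree to rounding error, and each of the three discrimination information values is non-negative.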

In a wide class of problems which can be characterized as "smoothing",
or fitting a model to an observed contingency table, the restraints specify
that the estimated distribution or contingency table have some set of
marginals, or more generally, linear functions of observed cell entries,
equal to those values for the observed contingency table. In such cases
π(ω) is taken to be either the uniform distribution π(ijkl) = 1/rstu, or
a distribution already estimated subject to restraints contained in and
implied by the restraints under examination. The latter case includes
the classical hypotheses of independence, conditional independence,
homogeneity, conditional homogeneity and interaction, all of which can
be considered as instances of generalized independence.

To test whether an observed contingency table is consistent with
the null hypothesis, or model, as represented by the minimum discrimination
information estimate, we compute a measure of the deviation between the
observed distribution and the appropriate estimate by the minimum
discrimination information statistic. For notational and computational
convenience, let us denote the estimated contingency table in terms of
occurrences by x*(ω) = np*(ω) where n is the total number of occurrences.
For the "smoothing" or fitting class of problems, that is, with the
restraints implied by a set of observed marginals (those of a generalized
independence hypothesis), or more generally, linear functions of observed
cell entries, the minimum discrimination information (m.d.i.) statistic is

(2)  2I(x:x*) = 2 Σ x(ω) ln [x(ω)/x*(ω)]

which is asymptotically distributed as a chi-squared variate with
appropriate degrees of freedom under the null hypothesis.

The statistic in (2) is also minus twice the logarithm of the
classic likelihood ratio statistic but this is not necessarily true for
other kinds of applications of the general theory (Berkson, 1972).
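A direct computation of the statistic in (2) might look as follows. This is a sketch with hypothetical counts: the estimate is the 2x2 independence estimate x(i.)x(.j)/n for those counts, and the function name is an assumption made for the example.

```python
import math

def mdi_statistic(x, x_star):
    """2I(x:x*) = 2 * sum of x ln(x/x*), with the convention 0 ln 0 = 0."""
    return 2.0 * sum(a * math.log(a / b) for a, b in zip(x, x_star) if a > 0)

# 2x2 table in lexicographic order: cells (11), (12), (21), (22)
observed = [30, 10, 20, 40]               # hypothetical counts
estimate = [20.0, 20.0, 30.0, 30.0]       # x(i.)x(.j)/n for these counts

statistic = mdi_statistic(observed, estimate)
```

For this table the statistic is about 17.26 on one degree of freedom, well above the usual 5 percent chi-squared point of 3.84, so these (made-up) data would not be judged consistent with independence.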


2. Computational Procedures

An "experiment" has been designed and observations made resulting
in a multidimensional contingency table with the desired classifications
and categories. All the information the analyst hopes to obtain from the
"experiment" is contained in the contingency table. In the process of
analysis, the aim is to fit the observed table with a minimal or
parsimonious number of parameters depending on some of the observed
marginals, and/or some general linear combinations of observed cell
entries, that is, essentially, to find out how much of this total
information is contained in a summary consisting of sets of marginals,
and/or some linear combinations of observed cell entries.

Indeed, the relationship between the concept of independence or
association and interaction in contingency tables and the role the
marginals play is evidenced in the historical developments in the
extensive literature on the analysis of contingency tables.

Let us denote by x the Ωx1 matrix of entries x(ω) of the observed
contingency table arranged in lexicographic order, and denote by T an
Ωx(m+1) design matrix of rank m+1 ≤ Ω. We denote the columns of T by
T_i(ω), 1 ≤ ω ≤ Ω, 0 ≤ i ≤ m. The condition that the estimate x*(ω) have
some set of marginals, and/or some general linear combination of cell
entries, equal to the corresponding values of the observed contingency
table is written in matrix notation as

(3)  T'x* = T'x.

Those columns of T which imply a marginal restraint are the indicator
functions of the marginals, that is, the corresponding T_i(ω) will be one
or zero for any cell ω, according as the cell ω does or does not enter
into the marginal in question. We usually take T_0(ω) = 1, for all ω, so
that Σ x*(ω) = Σ x(ω) = n. In accordance with the minimum discrimination
information theorem (Kullback, 1959), the m.d.i. estimate is the
exponential family

(4)  x*(ω) = exp(τ_0 T_0(ω) + τ_1 T_1(ω) + ... + τ_m T_m(ω)) nπ(ω).

If we denote the Ωx1 matrix whose entries are ln(x*(ω)/nπ(ω)), in
lexicographic order on ω, by ln(x*/nπ), then we have from (4) the
log-linear regression (Gokhale, 1971, 1972; Ku et al., 1974)

(5)  ln(x*/nπ) = Tτ,

where τ is the (m+1)x1 matrix of the parameters τ_0, τ_1, ..., τ_m. We set
the normalizing parameter τ_0 = L and τ_1, ..., τ_m are main effects and
interactions. The parameters in (4) are to be determined so that x*(ω)
satisfies the condition (3). There are convergent iterative computer
algorithms of proportional fitting (among others), which yield the
estimate x*(ω) satisfying (3), and then the parameters are determined
from (5). The iteration may be described as successively cycling through
adjustments of the marginals of interest starting with the π(ω) distribution
until a desired accuracy of agreement between the set of observed marginals
of interest and the computed marginals has been attained. See Ku et al.
(1971). Note that although nπ(ω) is here a constant and could be absorbed
into τ_0 or L, we prefer to express it explicitly because there are cases
in which nπ(ω) is not a constant and the expression in (4) or (5) still
applies (Ireland and Kullback, 1968a, b; Gokhale, 1971; Darroch and
Ratcliff, 1972).
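The cycling through marginal adjustments described above can be sketched for a 2x2x2 table in which all three two-way marginals are fitted, starting from the uniform table. This is a minimal illustration, not the general computer program referred to in the text: the counts, the function names, the tolerance and the cycle limit are all assumptions, and the sketch requires every fitted marginal to be positive.

```python
import math
from itertools import product

def marginal(table, keep):
    """Sum a table {(i, j, k): value} over the axes not listed in keep."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in keep)
        out[key] = out.get(key, 0.0) + value
    return out

def iterative_proportional_fit(observed, margins, tol=1e-10, max_cycles=1000):
    """Cycle through the listed marginals, rescaling the estimate to match each."""
    n = sum(observed.values())
    estimate = {cell: n / len(observed) for cell in observed}  # n*pi, pi uniform
    targets = [marginal(observed, m) for m in margins]
    for _ in range(max_cycles):
        for m, target in zip(margins, targets):
            current = marginal(estimate, m)
            for cell in estimate:
                key = tuple(cell[a] for a in m)
                estimate[cell] *= target[key] / current[key]
        # largest remaining disagreement between fitted and observed marginals
        gap = max(abs(marginal(estimate, m)[key] - target[key])
                  for m, target in zip(margins, targets) for key in target)
        if gap < tol:
            break
    return estimate

def log_ratio_ij(table, k):
    """Log cross-product ratio of i and j at level k of the third variable."""
    return math.log(table[(1, 1, k)] * table[(2, 2, k)]
                    / (table[(1, 2, k)] * table[(2, 1, k)]))

cells = list(product((1, 2), repeat=3))
observed = dict(zip(cells, [10, 20, 30, 40, 25, 15, 35, 25]))  # hypothetical counts
two_way = [(0, 1), (0, 2), (1, 2)]       # the ij-, ik- and jk-marginals
fitted = iterative_proportional_fit(observed, two_way)
```

Because every adjustment multiplies the cells by a factor depending on only two of the three indices, the fitted table has the same log cross-product ratio of i and j at k = 1 and k = 2, which is the condition of no second-order interaction.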

3. Analysis of Information

The analysis of information is based on the fundamental relation
(1) for the minimum discrimination information statistics. Specifically
if np*_a(ω) = x*_a(ω) is the minimum discrimination information estimate
corresponding to a set H_a of given marginals, and x*_b(ω) is the minimum
discrimination information estimate corresponding to a set H_b of given
marginals, where H_a is explicitly or implicitly contained in H_b, then the
basic relations are

     2I(x:nπ)    = 2I(x*_a:nπ) + 2I(x:x*_a)
(6)  2I(x:nπ)    = 2I(x*_b:nπ) + 2I(x:x*_b)
     2I(x*_b:nπ) = 2I(x*_a:nπ) + 2I(x*_b:x*_a)
     2I(x:x*_a)  = 2I(x*_b:x*_a) + 2I(x:x*_b)

with a corresponding additive relation for the associated degrees of
freedom.

In terms of the representation in (4) or (5), as an exponential
family, the two extreme cases are the uniform distribution for which all
τ's except L are zero, and the observed contingency table or distribution,
the complete model, for which all Ω-1 = rstu - 1 τ's in addition to L are
needed.

Measures of the form 2I(x:x*_a), that is, the comparison of an
observed contingency table with an estimated contingency table, are
called measures of interaction or goodness-of-fit. Measures of the form
2I(x*_b:x*_a), comparing two estimated contingency tables, are called measures
of effect, that is the effect of the marginals in the set H_b but not in
the set H_a, or the taus in x*_b but not in x*_a. We note that 2I(x:x*_a) tests
a null hypothesis that the values of the τ parameters in the representation
of the observed contingency table x(ω) but not in the representation of
the estimated table x*_a(ω) are zero and the number of these taus is the number
of degrees of freedom. Similarly 2I(x*_b:x*_a) tests a null hypothesis that
the values of the set of τ parameters in the representation of the estimated
table x*_b(ω) but not in the representation of the estimated table x*_a(ω) are
zero, and the number of these taus is the number of degrees of freedom.

See section 5, The 2x2x2x2 Table.

We summarize the additive relationships of the m.d.i. statistics
and the associated degrees of freedom in the Analysis of Information Table 1.

TABLE 1

ANALYSIS OF INFORMATION TABLE

Component due to        Information        D.F.

H_a:  Interaction       2I(x:x*_a)         N_a

H_b:  Effect            2I(x*_b:x*_a)      N_a - N_b

      Interaction       2I(x:x*_b)         N_b
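The additive relationship in the analysis of information table can be checked numerically. In the sketch below, which uses hypothetical counts and names, H_a is taken to be the set of one-way marginals of a 2x2x2 table (mutual independence) and H_b the ik- and jk-marginals (conditional independence of i and j given k), so that H_a is implicitly contained in H_b.

```python
import math
from itertools import product

def marginal(table, keep):
    """Sum a table {(i, j, k): value} over the axes not listed in keep."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in keep)
        out[key] = out.get(key, 0.0) + value
    return out

def two_I(x, y):
    """2I(x:y) = 2 * sum of x ln(x/y), with the convention 0 ln 0 = 0."""
    return 2.0 * sum(x[c] * math.log(x[c] / y[c]) for c in x if x[c] > 0)

cells = list(product((1, 2), repeat=3))
x = dict(zip(cells, [10, 20, 30, 40, 25, 15, 35, 25]))   # hypothetical counts
n = sum(x.values())
xi, xj, xk = marginal(x, (0,)), marginal(x, (1,)), marginal(x, (2,))
xik, xjk = marginal(x, (0, 2)), marginal(x, (1, 2))

# H_a: one-way marginals; H_b: ik- and jk-marginals
x_a = {(i, j, k): xi[(i,)] * xj[(j,)] * xk[(k,)] / n**2 for (i, j, k) in cells}
x_b = {(i, j, k): xik[(i, k)] * xjk[(j, k)] / xk[(k,)] for (i, j, k) in cells}

interaction_a = two_I(x, x_a)    # 2I(x:x*_a), 4 degrees of freedom here
effect_b = two_I(x_b, x_a)       # 2I(x*_b:x*_a), 2 degrees of freedom here
interaction_b = two_I(x, x_b)    # 2I(x:x*_b), 2 degrees of freedom here
```

The interaction component for H_a equals the sum of the effect component and the interaction component for H_b, and the degrees of freedom add in the same way (4 = 2 + 2).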

Since measures of the form 2I(x:x*_a) may also be interpreted as measures
of the "variation unexplained" by the estimate x*_a, the additive relationship
leads to the interpretation of the ratio

(7)  [2I(x:x*_a) - 2I(x:x*_b)] / 2I(x:x*_a) = 2I(x*_b:x*_a) / 2I(x:x*_a)

as the percentage of the unexplained variation due to x*_a accounted for by
the additional constraints defining x*_b. The ratio (7) is thus similar to
the squared correlation coefficients associated with normal distributions
(Goodman, 1970).

We remark that the marginals, explicit and implicit, of the estimated
table x*_a(ω), which form the set of restraints H_a used to generate x*_a(ω), are
the same as the corresponding marginals of the observed x(ω) table and all
lower order implied marginals. It may be shown that 2I(x:x*_a) is
approximately a quadratic in the differences between the remaining marginals of
the x(ω) table and the corresponding ones as calculated from x*_a(ω).

Similarly, 2I(x*_b:x*_a) is also approximately a quadratic in the
differences between those additional marginal restraints in H_b but not in
H_a and the corresponding marginal values as computed from the x*_a(ω) table.

The τ's are determined from the log-linear regression equations (5)
as sums and differences of values of ln x*(ω) or as linear combinations
thereof. A variety of statistics have been presented in the literature for
the analysis of contingency tables, which are quadratics in differences of
marginal values or quadratics in the τ's or the linear combinations of
logarithms of the observed or estimated values. The principle of minimum
discrimination information estimation and its procedures thus provide a
unifying relationship since such statistics may be seen as quadratic
approximations of the minimum discrimination information statistic. We
remark that the corresponding approximate χ²'s are not generally additive
(Berkson, 1972).

We mention the approximations in terms of quadratic forms in the
marginals, or the τ's, as a possible bridge to relate the familiar
procedures of classical regression analysis and the procedures proposed
here. This may assist in understanding and interpreting the analysis of
information tables (Kullback, 1959). The covariance matrix of the T(ω)
functions or the taus can be obtained for either the observed table or
any of the estimated tables, as well as the inverse matrices, as part of
the output of the general computer program.


4.  The  2x2  Table 


It may be useful to reexamine the 2x2 table from the point of view of the preceding discussion. The algebraic details are simple in this case and exhibit the unification of the information theoretic development.

Suppose we have the observed 2x2 table in Figure 1.


        x(11)    x(12)    x(1.)
        x(21)    x(22)    x(2.)
        x(.1)    x(.2)    n

                Figure 1

If we obtain the m.d.i. estimate fitting the one-way marginals, the generalized independence hypothesis is the classical independence hypothesis and the minimum discrimination information estimate is the usual x*(ij) = x(i.)x(.j)/n. By the iterative scaling fitting procedure, we begin with $x^{(0)}(ij) = n/4$ in each cell and adjust the $x^{(0)}(ij)$ values by the ratios of the observed row marginals to those of $x^{(0)}(ij)$, that is,

\[
x^{(1)}(ij) = x^{(0)}(ij)\,\frac{x(i.)}{x^{(0)}(i.)} = \frac{n}{4}\cdot\frac{x(i.)}{n/2} = \frac{x(i.)}{2}.
\]

Then we adjust $x^{(1)}(ij)$ by the ratio of the observed column marginals to the marginals of $x^{(1)}(ij)$,

\[
x^{(2)}(ij) = x^{(1)}(ij)\,\frac{x(.j)}{x^{(1)}(.j)} = \frac{x(i.)}{2}\cdot\frac{x(.j)}{n/2} = \frac{x(i.)\,x(.j)}{n} = x^*(ij).
\]

Since the row and column marginals of x*(ij) are now the same as the observed values, no further iterative adjustment is necessary. For fitting a 2x2 table to externally specified marginals see Ireland and Kullback, 1968b, or Fisher's 2x2 table in the examples.
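The two adjustment steps just described can be traced numerically. The following is a minimal sketch in Python, not the general computer program referred to in the text; the function name `iterative_scaling_2x2` and the counts in `table` are hypothetical and serve only as an illustration.

```python
def iterative_scaling_2x2(x, cycles=1):
    """Fit the one-way marginals of a 2x2 table by iterative scaling."""
    n = float(sum(x[0]) + sum(x[1]))
    rows = [sum(x[0]), sum(x[1])]                    # observed x(i.)
    cols = [x[0][0] + x[1][0], x[0][1] + x[1][1]]    # observed x(.j)
    est = [[n / 4.0, n / 4.0], [n / 4.0, n / 4.0]]   # start: n/4 in each cell
    for _ in range(cycles):
        # Row step: scale each row to the observed row marginal.
        for i in range(2):
            current = sum(est[i])
            est[i] = [e * rows[i] / current for e in est[i]]
        # Column step: scale each column to the observed column marginal.
        for j in range(2):
            current = est[0][j] + est[1][j]
            for i in range(2):
                est[i][j] *= cols[j] / current
    return est


table = [[10, 20], [30, 40]]          # hypothetical counts, n = 100
fit = iterative_scaling_2x2(table)    # one cycle already gives x(i.)x(.j)/n
```

For this table one cycle reproduces x(i.)x(.j)/n in every cell, and further cycles leave the estimate unchanged, in agreement with the remark that no further iterative adjustment is necessary.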

The representation of the log-linear regression for the complete model is given in Figure 2. The entries in the columns τ1, τ2, τ3 are, respectively, the values of the functions $T_1(ij)$, $T_2(ij)$, $T_3(ij)$ associated with the marginals x(1.), x(.1), x(11), and the column headed L corresponds to the normalizing factor. Blank entries are zero.

        i  j     L    τ1   τ2   τ3
        1  1     1    1    1    1
        1  2     1    1
        2  1     1         1
        2  2     1

                Figure 2

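We note the interpretation of Figure 2 as the log-linear relations

\[
\begin{aligned}
\ln\frac{x(11)}{n/4} &= L + \tau_1 + \tau_2 + \tau_3,\\
\ln\frac{x(12)}{n/4} &= L + \tau_1,\\
\ln\frac{x(21)}{n/4} &= L + \tau_2,\\
\ln\frac{x(22)}{n/4} &= L.
\end{aligned}
\tag{8}
\]

From (8) we find

\[
\begin{aligned}
L &= \ln\bigl(x(22)/(n/4)\bigr),\\
\tau_1 &= \ln\bigl(x(12)/x(22)\bigr),\\
\tau_2 &= \ln\bigl(x(21)/x(22)\bigr),\\
\tau_3 &= \ln\bigl(x(11)x(22)/x(12)x(21)\bigr),
\end{aligned}
\]

or

\[
\begin{aligned}
\tau_1 &= \ln x(12) - \ln x(22),\\
\tau_2 &= \ln x(21) - \ln x(22),\\
\tau_3 &= \ln x(11) + \ln x(22) - \ln x(12) - \ln x(21).
\end{aligned}
\tag{9}
\]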
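As a check on the relations between the cell entries and the parameters L, τ1, τ2, τ3 of the complete model, the following sketch computes the parameters from a 2x2 table and rebuilds the table from them. The function names and the counts are hypothetical.

```python
import math


def loglinear_2x2(x):
    """Return (L, tau1, tau2, tau3) of the complete model for a 2x2 table."""
    (x11, x12), (x21, x22) = x
    n = x11 + x12 + x21 + x22
    L = math.log(x22 / (n / 4.0))
    tau1 = math.log(x12 / x22)
    tau2 = math.log(x21 / x22)
    tau3 = math.log(x11 * x22 / (x12 * x21))   # log cross-product ratio
    return L, tau1, tau2, tau3


def rebuild_2x2(L, tau1, tau2, tau3, n):
    """Invert the log-linear relations: cell = (n/4) * exp(sum of parameters)."""
    base = n / 4.0
    return [[base * math.exp(L + tau1 + tau2 + tau3), base * math.exp(L + tau1)],
            [base * math.exp(L + tau2), base * math.exp(L)]]


table = [[10, 20], [30, 40]]                   # hypothetical counts
params = loglinear_2x2(table)
rebuilt = rebuild_2x2(*params, n=100)
```

The rebuilt table agrees with the observed one, which reflects the fact that the complete model with all columns of Figure 2 reproduces the observed values.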

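The design matrix T is the matrix of Figure 2, that is,

\[
T = \begin{pmatrix}
1 & 1 & 1 & 1\\
1 & 1 & 0 & 0\\
1 & 0 & 1 & 0\\
1 & 0 & 0 & 0
\end{pmatrix}.
\]

Define the diagonal matrix D with main diagonal the elements x(ij), in lexicographic order, that is,

\[
D = \operatorname{diag}\bigl(x(11),\, x(12),\, x(21),\, x(22)\bigr),
\]

then the estimate of the covariance matrix of x(1.), x(.1), x(11), for the observed contingency table is $S_{22.1}$ where

\[
S = T'DT = \begin{pmatrix} S_{11} & S_{12}\\ S_{21} & S_{22} \end{pmatrix},
\qquad
S_{22.1} = S_{22} - S_{21}S_{11}^{-1}S_{12},
\]

and $S_{11}$ is 1 x 1, $S_{22}$ is 3 x 3, $S_{21}' = S_{12}$ is 1 x 3. It is found that

\[
S_{22.1} = \begin{pmatrix}
\dfrac{x(1.)x(2.)}{n} & x(11) - \dfrac{x(1.)x(.1)}{n} & \dfrac{x(11)x(2.)}{n}\\[2ex]
x(11) - \dfrac{x(1.)x(.1)}{n} & \dfrac{x(.1)x(.2)}{n} & \dfrac{x(11)x(.2)}{n}\\[2ex]
\dfrac{x(11)x(2.)}{n} & \dfrac{x(11)x(.2)}{n} & \dfrac{x(11)\bigl(n - x(11)\bigr)}{n}
\end{pmatrix}
\]

and the inverse matrix is

\[
S_{22.1}^{-1} = \begin{pmatrix}
\dfrac{1}{x(12)} + \dfrac{1}{x(22)} & \dfrac{1}{x(22)} & -\dfrac{1}{x(12)} - \dfrac{1}{x(22)}\\[2ex]
\dfrac{1}{x(22)} & \dfrac{1}{x(21)} + \dfrac{1}{x(22)} & -\dfrac{1}{x(21)} - \dfrac{1}{x(22)}\\[2ex]
-\dfrac{1}{x(12)} - \dfrac{1}{x(22)} & -\dfrac{1}{x(21)} - \dfrac{1}{x(22)} & \dfrac{1}{x(11)} + \dfrac{1}{x(12)} + \dfrac{1}{x(21)} + \dfrac{1}{x(22)}
\end{pmatrix}.
\]

The matrix $S_{22.1}^{-1}$ is the covariance matrix of the τ's in (9). Similar results hold in general and for estimated tables (Kullback, 1959).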
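The relation between the covariance matrix of the marginals and the covariance matrix of the τ's can be verified numerically. The sketch below builds S = T'DT for a hypothetical 2x2 table, forms the matrix obtained by eliminating the L column, and multiplies it by the closed-form covariance matrix of the τ's built from reciprocals of the cell entries; the product should be the 3 x 3 identity. All function names and the counts are assumptions made for illustration.

```python
def marginal_covariance(x):
    """S22 - S21 S11^{-1} S12 for S = T'DT with the 2x2 design of Figure 2."""
    (x11, x12), (x21, x22) = x
    T = [[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 0]]
    d = [x11, x12, x21, x22]                       # diagonal of D
    S = [[sum(T[w][p] * d[w] * T[w][q] for w in range(4)) for q in range(4)]
         for p in range(4)]
    s11 = float(S[0][0])                           # equals n
    return [[S[p][q] - S[p][0] * S[0][q] / s11 for q in range(1, 4)]
            for p in range(1, 4)]


def tau_covariance(x):
    """Closed-form covariance matrix of (tau1, tau2, tau3)."""
    (x11, x12), (x21, x22) = x
    a, b, c, d = 1.0 / x11, 1.0 / x12, 1.0 / x21, 1.0 / x22
    return [[b + d, d, -(b + d)],
            [d, c + d, -(c + d)],
            [-(b + d), -(c + d), a + b + c + d]]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


table = [[10, 20], [30, 40]]                       # hypothetical counts
product = matmul(marginal_covariance(table), tau_covariance(table))
```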

Note that the value of the logarithm of the cross-product ratio, a measure of association or interaction, appears in the course of the analysis as the value of τ3 for the observed values x(ij). For x*(ij), the estimate under the hypothesis of independence, the representation as in Figure 2 does not involve the last column, since x*(ij) is obtained by fitting the one-way marginals, and τ3 = 0.

The log-linear relations for the estimate x*(ij) are

\[
\begin{aligned}
\ln\frac{x^*(11)}{n/4} &= L + \tau_1 + \tau_2,\\
\ln\frac{x^*(12)}{n/4} &= L + \tau_1,\\
\ln\frac{x^*(21)}{n/4} &= L + \tau_2,\\
\ln\frac{x^*(22)}{n/4} &= L,
\end{aligned}
\tag{10}
\]

where the numerical values of L, τ1, τ2 in (10) must of course depend on x* and differ from the values in (8).

The minimum discrimination information statistic to test the null hypothesis or model of independence is 2I(x:x*) with one degree of freedom. In this case the quadratic approximation is

\[
2I(x:x^*) \approx \Bigl(x(11) - \frac{x(1.)x(.1)}{n}\Bigr)^{2}
\Bigl(\frac{1}{x^*(11)} + \frac{1}{x^*(12)} + \frac{1}{x^*(21)} + \frac{1}{x^*(22)}\Bigr).
\tag{11}
\]

Remembering that x*(ij) = x(i.)x(.j)/n, the right-hand side of (11) may also be shown to be

\[
\chi^2 = \sum_{i}\sum_{j} \Bigl(x(ij) - \frac{x(i.)x(.j)}{n}\Bigr)^{2} \Big/ \frac{x(i.)x(.j)}{n},
\tag{12}
\]

the classical $\chi^2$-test for independence with one degree of freedom. Another test which has been proposed for the null hypothesis of no association or no interaction in the 2x2 table is

\[
\bigl(\ln x(11) + \ln x(22) - \ln x(12) - \ln x(21)\bigr)^{2}
\Bigl(\frac{1}{x(11)} + \frac{1}{x(12)} + \frac{1}{x(21)} + \frac{1}{x(22)}\Bigr)^{-1},
\]

which may be shown to be a quadratic approximation for 2I(x:x*) in terms of τ3 with the covariance matrix estimated using the observed values and not the estimated values. We remark that if the observed values are used to estimate the covariance matrix then instead of the classical $\chi^2$-test in (12) there is derived the Neyman modified chi-square

\[
\chi'^{2} = \sum_{i}\sum_{j} \Bigl(x(ij) - \frac{x(i.)x(.j)}{n}\Bigr)^{2} \Big/ x(ij).
\]
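The four statistics discussed for the 2x2 table can be compared on a small numerical case. The sketch below is illustrative only: the function name, the dictionary keys and the counts are hypothetical, and the m.d.i. statistic is computed as 2Σ x ln(x/x*) with x* the independence estimate.

```python
import math


def independence_statistics(x):
    """Statistics for testing independence in a 2x2 table."""
    (x11, x12), (x21, x22) = x
    n = float(x11 + x12 + x21 + x22)
    obs = [x11, x12, x21, x22]
    r1, r2 = x11 + x12, x21 + x22
    c1, c2 = x11 + x21, x12 + x22
    fit = [r1 * c1 / n, r1 * c2 / n, r2 * c1 / n, r2 * c2 / n]   # x*(ij)
    log_cpr = math.log(x11 * x22 / (x12 * x21))                  # tau3
    return {
        "mdi": 2.0 * sum(o * math.log(o / f) for o, f in zip(obs, fit)),
        # Quadratic in the (11) marginal difference, estimated values in the weights.
        "quadratic_11": (x11 - fit[0]) ** 2 * sum(1.0 / f for f in fit),
        # Classical chi-square for independence.
        "pearson": sum((o - f) ** 2 / f for o, f in zip(obs, fit)),
        # Quadratic in tau3, observed values in the weights.
        "log_cross_product": log_cpr ** 2 / sum(1.0 / o for o in obs),
        # Neyman modified chi-square.
        "neyman": sum((o - f) ** 2 / o for o, f in zip(obs, fit)),
    }


stats = independence_statistics([[10, 20], [30, 40]])   # hypothetical counts
```

In this example `quadratic_11` and `pearson` coincide, as the text states for (11) and (12), while the other values differ from them only slightly.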
5.  The 2x2x2x2 Table

A useful graphic representation of the log-linear regression (5) is given in Figure 3 for a 2x2x2x2 contingency table. This is the analogue of the design matrix in normal regression theory. The blank spaces in Figure 3 represent zero values. The (ijkℓ)-columns are the cell identifications in the same lexicographic order as the cell entries for the estimates in the computer output. Column 1 corresponds to L, which is the normalizing factor. Each of the columns 2 to 16 represents the corresponding values of the T(ω) functions: columns 2 to 5 those for the one-way marginals, columns 6 to 11 those for the two-way marginals, columns 12 to 15 those for the three-way marginals, and column 16 that for the four-way marginal.

The tau parameter associated with the T(ω) function is given at the head of the column. The superscripts are useful identifications. The complete representation with all the columns of Figure 3 generates the observed values. Thus the rows represent

\[
\ln\frac{x(ijk\ell)}{n\pi(ijk\ell)} = L + \tau^{i}_{1}T^{i}_{1}(ijk\ell) + \cdots + \tau^{ij}_{11}T^{ij}_{11}(ijk\ell) + \cdots + \tau^{ijk}_{111}T^{ijk}_{111}(ijk\ell) + \cdots + \tau^{ijk\ell}_{1111}T^{ijk\ell}_{1111}(ijk\ell),
\]

where π(ijkℓ) in the 2x2x2x2 case is 1/(2x2x2x2) and the numerical values of L and the taus depend on the observed values x(ijkℓ). The design matrix corresponding to an estimate uses only those columns associated with the marginals explicit and implied in the fitting process. This is a reflection of the fact that higher order marginals imply certain lower order marginals; for example, the two-way marginal x(ij..) implies, by summation over i and j, the one-way marginals x(.j..), x(i...), and the total n = x(....).

[Figure 3. Graphic representation: the 16 x 16 design matrix, with rows the cells (ijkℓ) in lexicographic order and columns 1 (L), 2-5 ($\tau^{i}_{1}, \tau^{j}_{1}, \tau^{k}_{1}, \tau^{\ell}_{1}$), 6-11 ($\tau^{ij}_{11}, \tau^{ik}_{11}, \tau^{i\ell}_{11}, \tau^{jk}_{11}, \tau^{j\ell}_{11}, \tau^{k\ell}_{11}$), 12-15 ($\tau^{ijk}_{111}, \tau^{ij\ell}_{111}, \tau^{ik\ell}_{111}, \tau^{jk\ell}_{111}$), 16 ($\tau^{ijk\ell}_{1111}$).]

The representation for the uniform distribution corresponds to column 1 only. The estimate $x_1^*(ijk\ell)$ based on fitting the one-way marginals will use only columns 1-5. The values of L and the taus for this estimate will be different from those for x(ijkℓ) and depend on the estimate $x_1^*(ijk\ell)$. The representation in Figure 3 implies for $x_1^*(ijk\ell)$

\[
\begin{aligned}
\ln\frac{x_1^*(1111)}{n\pi} &= L + \tau^{i}_{1} + \tau^{j}_{1} + \tau^{k}_{1} + \tau^{\ell}_{1},\\
\ln\frac{x_1^*(1112)}{n\pi} &= L + \tau^{i}_{1} + \tau^{j}_{1} + \tau^{k}_{1},\\
&\;\;\vdots\\
\ln\frac{x_1^*(2222)}{n\pi} &= L.
\end{aligned}
\]

nw 


The estimate $x_2^*(ijk\ell)$ based on fitting the two-way marginals will use columns 1-11 since the two-way marginals also imply the one-way marginals. The values of L and the taus for this estimate will be different from those for the observed values or other estimates and depend on the values of the estimate $x_2^*(ijk\ell)$. For the estimate fitting the two-way marginals the representation in Figure 3 implies

\[
\begin{aligned}
\ln\frac{x_2^*(1111)}{n\pi} &= L + \tau^{i}_{1} + \tau^{j}_{1} + \tau^{k}_{1} + \tau^{\ell}_{1} + \tau^{ij}_{11} + \tau^{ik}_{11} + \tau^{i\ell}_{11} + \tau^{jk}_{11} + \tau^{j\ell}_{11} + \tau^{k\ell}_{11},\\
\ln\frac{x_2^*(1112)}{n\pi} &= L + \tau^{i}_{1} + \tau^{j}_{1} + \tau^{k}_{1} + \tau^{ij}_{11} + \tau^{ik}_{11} + \tau^{jk}_{11},\\
&\;\;\vdots\\
\ln\frac{x_2^*(2222)}{n\pi} &= L.
\end{aligned}
\]

The estimate $x_3^*(ijk\ell)$ based on fitting the three-way marginals will use columns 1-15 since the three-way marginals also imply the two-way and one-way marginals.

Note that in the graphic representation in Figure 3 we set all taus with subscript i=2 and/or j=2 and/or k=2 and/or ℓ=2 equal to zero, by convention, to insure linear independence.
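The design matrix of Figure 3 can be generated mechanically from the convention just stated: a T function equals 1 in a cell exactly when every index named in its superscript equals 1, and 0 otherwise. The following sketch is one possible construction; the function name and the 0-based factor labels (0 for i, 1 for j, 2 for k, 3 for ℓ) are assumptions of the sketch.

```python
from itertools import combinations, product


def design_matrix(num_factors=4):
    """Cells, column labels and 0/1 design matrix for a 2x2x...x2 table."""
    cells = list(product((1, 2), repeat=num_factors))    # lexicographic order
    columns = [()]                                       # column 1: L
    for order in range(1, num_factors + 1):              # one-way, two-way, ...
        columns.extend(combinations(range(num_factors), order))
    T = [[1 if all(cell[f] == 1 for f in col) else 0 for col in columns]
         for cell in cells]
    return cells, columns, T


cells, columns, T = design_matrix()
# Terms entering the two-way estimate (columns 1-11) for cell 1112.
terms_1112 = sum(T[cells.index((1, 1, 1, 2))][:11])
```

The count `terms_1112` is 7, matching the seven terms (L, three one-way taus and three two-way taus) in the relation for cell 1112 under the two-way fit.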
The analysis of information table corresponding to the hierarchical fitting of $x_1^*(ijk\ell)$, $x_2^*(ijk\ell)$, $x_3^*(ijk\ell)$ is shown in Table 2.

                              TABLE 2

                      ANALYSIS OF INFORMATION

        Component due to             Information           D.F.

        All one-way marginals        2I(x:x*_1)             11
        All two-way marginals        2I(x*_2:x*_1)           6
                                     2I(x:x*_2)              5
        All three-way marginals      2I(x*_3:x*_2)           4
                                     2I(x:x*_3)              1

2I(x:x*_1) tests the null hypothesis that the eleven taus of columns 6-16 are equal to zero.

2I(x*_2:x*_1) tests the null hypothesis that the six taus of columns 6-11 are equal to zero.

2I(x:x*_2) tests the null hypothesis that the five taus of columns 12-16 are zero.

2I(x*_3:x*_2) tests the null hypothesis that the four taus of columns 12-15 are zero.

2I(x:x*_3) tests the null hypothesis that the tau of column 16 is zero.

In the examples we shall see other tests on the interaction parameters (Kullback, 1974). We now consider a number of examples to illustrate more specifically various aspects of the analysis.
6.  Algorithms to calculate quadratic approximations.

We now present algorithms to calculate quadratic approximations to $2I(x:x_a^*)$, $2I(x_b^*:x_a^*)$, $2I(x^*:x)$.

1.  $2I(x:x_a^*)$.

a)  Compute $x_a^*$.

b)  Using the T design matrix corresponding to x (including the L column), compute the matrix $S = T'D_a^*T$, where $D_a^*$ is a diagonal matrix whose entries are the values of $x_a^*$ in the same order as for the rows of the T-matrix.

c)  Let

\[
S = \begin{pmatrix} S_{11} & S_{12}\\ S_{21} & S_{22} \end{pmatrix},
\]

where $S_{11}$ is a 1x1 matrix, then $S_{22.1} = S_{22} - S_{21}S_{11}^{-1}S_{12}$.

d)  Compute $S_{22.1}^{-1}$.

e)  Consider the marginals which do not enter into the specification of $x_a^*$, and let d' be a one row matrix whose entries are the differences between the set of marginals just considered, in the x and $x_a^*$ tables.

f)  Let B be that submatrix of $S_{22.1}^{-1}$ whose rows and columns correspond to the τ columns of the design matrix associated with the set of marginals in step e).

g)  Compute d'Bd. This is the "marginals" approximation to $2I(x:x_a^*)$.

h)  Compute the set of τ's associated with the marginals considered in e) for the x distribution, and call the one row matrix of these τ's τ'. Compute $\tau'B^{-1}\tau$ where $B^{-1}$ is the inverse of the matrix B in f). $\tau'B^{-1}\tau$ is the "tau" approximation to $2I(x:x_a^*)$.

i)  The "marginals" approximation is also equal to

\[
\sum_{\omega} \frac{\bigl(x(\omega) - x_a^*(\omega)\bigr)^{2}}{x_a^*(\omega)}.
\]
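Steps a) through g) of algorithm 1 can be followed on the 2x2 table with the independence estimate as $x_a^*$. The sketch below is an illustration under stated assumptions, not the general computer program: the helper `invert` is a plain Gauss-Jordan routine written for this example, the counts are hypothetical, and the only marginal outside the specification of $x_a^*$ is x(11), so that B reduces to a single element.

```python
def invert(m):
    """Gauss-Jordan inverse of a small nonsingular square matrix."""
    k = len(m)
    a = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(k)]
         for i, row in enumerate(m)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))   # partial pivoting
        a[c], a[p] = a[p], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for r in range(k):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[k:] for row in a]


def marginals_approximation_2x2(x):
    (x11, x12), (x21, x22) = x
    n = float(x11 + x12 + x21 + x22)
    rows, cols = [x11 + x12, x21 + x22], [x11 + x21, x12 + x22]
    # a) estimate fitting the one-way marginals, cells in lexicographic order
    xa = [rows[i] * cols[j] / n for i in range(2) for j in range(2)]
    # b) S = T' D*_a T with the full design matrix (L, tau1, tau2, tau3)
    T = [[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 0]]
    S = [[sum(T[w][p] * xa[w] * T[w][q] for w in range(4)) for q in range(4)]
         for p in range(4)]
    # c), d) eliminate the L column and invert
    S221 = [[S[p][q] - S[p][0] * S[0][q] / S[0][0] for q in range(1, 4)]
            for p in range(1, 4)]
    inverse = invert(S221)
    # e) difference in the marginal x(11); f) B is the tau3 element
    d = x11 - xa[0]
    B = inverse[2][2]
    # g) the "marginals" approximation d'Bd
    return d * B * d, xa


table = [[10, 20], [30, 40]]                        # hypothetical counts
approx, xa = marginals_approximation_2x2(table)
observed = [10, 20, 30, 40]
pearson = sum((o - f) ** 2 / f for o, f in zip(observed, xa))
```

For this table `approx` and `pearson` agree, consistent with the equality of (11) and (12) in the 2x2 case.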
2.  $2I(x_b^*:x_a^*)$.

a)  Compute $x_b^*$, $x_a^*$.

b)  Using the T design matrix corresponding to $x_b^*$ (including the L column), compute the matrix $S = T'D_a^*T$, where $D_a^*$ is a diagonal matrix whose entries are the values of $x_a^*$ in the same order as for the rows of the T-matrix.

c)  Let

\[
S = \begin{pmatrix} S_{11} & S_{12}\\ S_{21} & S_{22} \end{pmatrix},
\]

where $S_{11}$ is a 1x1 matrix, then $S_{22.1} = S_{22} - S_{21}S_{11}^{-1}S_{12}$.

d)  Compute $S_{22.1}^{-1}$.

e)  Consider the marginals which enter into the specification of $x_b^*$ but not in $x_a^*$, and let d' be a one row matrix whose entries are the differences between the set of marginals just considered in the $x_b^*$ and $x_a^*$ tables.

f)  Let B be that submatrix of $S_{22.1}^{-1}$ whose rows and columns correspond to the τ columns of the design matrix associated with the set of marginals in step e).

g)  Compute d'Bd. This is the "marginals" approximation to $2I(x_b^*:x_a^*)$.

h)  Compute the set of τ's associated with the marginals considered in e) for the $x_b^*$ distribution and call the one row matrix of these τ's τ'. Compute $\tau'B^{-1}\tau$ where $B^{-1}$ is the inverse of the matrix B in f). $\tau'B^{-1}\tau$ is the "tau" approximation to $2I(x_b^*:x_a^*)$.

i)  The  "marginals"  approximation  is  also  equal  to 
3.  $2I(x^*:x)$.

a)  Using the T design matrix corresponding to x* (including the L column), compute the matrix $S = T'D_xT$, where $D_x$ is a diagonal matrix whose entries are the values of x in the same order as for the rows of the T-matrix.

b)  Let

\[
S = \begin{pmatrix} S_{11} & S_{12}\\ S_{21} & S_{22} \end{pmatrix},
\]

where $S_{11}$ is a 1x1 matrix, then $S_{22.1} = S_{22} - S_{21}S_{11}^{-1}S_{12}$.

c)  Compute $S_{22.1}^{-1}$.

d)  Let d' be a one row matrix whose entries are the differences between the $\sum_{\omega} T(\omega)x^*(\omega)$ and $\sum_{\omega} T(\omega)x(\omega)$. In the case when x*(ω) is specified by conditions external to the observed values, the value of $\sum_{\omega} T(\omega)x^*(\omega)$ is specified without having to compute x*(ω).

e)  Compute $d'S_{22.1}^{-1}d$. This is the approximation to $2I(x^*:x)$. Note that this can be obtained without computing x*.

f)  The  approximation 

y 

L  X 

requires  the  prior  computation  of  x*. 
4.  Applications

In this chapter we consider eight examples illustrating various aspects of the model fitting methodology by the analysis of real data.

Example 1.  Classification of multivariate dichotomous populations. This example illustrates the analysis of a five-way 2x2x2x2x2 contingency table. It introduces the use of log-odds or logit representation, and the multiplicative version of the odds as a product of factors. It also illustrates the interpretation of the parameters, and the effect of interaction on the numerical value of the association between classifications. It considers several models with respect to the marginals fitted, the design matrices, and the detailed hierarchical analysis of information.


An Example of Multiway Contingency Table Analysis Applied to the Classification of Multivariate Dichotomous Populations

Introduction 

Multiway contingency tables, or cross-classifications of vectors of discrete random variables, provide a useful approach to the analysis of multivariate discrete data. In the particular application we consider, the individual variates are dichotomous or binary. Note however that the procedures and analysis are not restricted to dichotomous or binary data but are also applicable to polychotomous variates.

For background on the study and problem leading to the data we consider see Solomon (1960). In Ku et al. (1969) minimum discrimination information procedures were applied to problems of multivariate binary data in information systems, such as communication, pattern recognition, and learning systems. In Cox (1972) there is a review of methods and models for the analysis of multivariate binary data and Solomon's data is given as a typical example. Martin and Bradley (1972) developed a model based on a set of orthogonal polynomials and applied it to Solomon's data. We remark that our procedure based on the principle of minimum discrimination information estimation applied to the analysis of multiway contingency tables yields a result practically equivalent to that of Martin and Bradley (1972). Goodman (1973) discusses Solomon's data in relation to methods for selecting models for contingency tables.
Solomon's Data

A total of 2982 high-school seniors were given an attitude questionnaire to assess their attitude towards science. The students were also classified on the basis of an IQ test into high IQ, the upper half, and low IQ, the lower half. The sixteen possible response vectors to each of four agree-disagree responses were tabulated. The problem of interest was to determine whether the response vectors could be used as a basis for classifying the students into one of two classes and to evaluate possible classification procedures.

Contingency  Table  Analysis 

We shall treat the data given in Table 1 as a five-way 2x2x2x2x2 contingency table, denoting the original observations by x(hijkℓ), where

        Characteristic    Index    1           2

        IQ                h        low IQ      high IQ
        Response 1        i        disagree    agree
        Response 2        j        disagree    agree
        Response 3        k        disagree    agree
        Response 4        ℓ        disagree    agree

As a first overview of the data to determine the marginals and their related interaction parameters which may furnish significant values in the log-linear representation of the exponential family of the estimates obtained by iterative scaling fitting, we list in Table 2a, Analysis of Information, a sequential hierarchical study of interaction and effect type measures, Kullback (1970), Ku et al. (1971).


The first estimate we start with is

\[
x_a^*(hijk\ell) = x(h\cdot\cdot\cdot\cdot)\,x(\cdot ijk\ell)/n
\]

since the minimum discrimination information statistic (interaction type measure)

\[
2I(x:x_a^*) = 2\sum_{h}\sum_{i}\sum_{j}\sum_{k}\sum_{\ell} x(hijk\ell)\,\ln\frac{n\,x(hijk\ell)}{x(h\cdot\cdot\cdot\cdot)\,x(\cdot ijk\ell)}
\]

tests a null hypothesis that the IQ groupings are homogeneous over the sixteen response vectors, Kullback (1959, Chap. 3). This null hypothesis is rejected and the subsequent study of effect and interaction type measures is an attempt to find a good fit to the data and account for the total variation as measured by $2I(x:x_a^*)$. Although the association between IQ and the response to the first statement as measured by $2I(x_b^*:x_a^*)$ = 2.376, 1 D.F., is not significant, it was decided to examine in detail the estimate $x_e^*(hijk\ell)$ whose numerical values are given in Table 1. It may be shown that

\[
2I(x_b^*:x_a^*) = 2\sum_{h}\sum_{i} x(hi\cdot\cdot\cdot)\,\ln\frac{n\,x(hi\cdot\cdot\cdot)}{x(h\cdot\cdot\cdot\cdot)\,x(\cdot i\cdot\cdot\cdot)}
\]

and tests a null hypothesis that IQ is homogeneous over the response to the first question. The estimate $x_e^*(hijk\ell)$ was selected because the interaction type measure, $2I(x:x_e^*)$ = 16.307, 11 D.F., represents an acceptable fit, the estimate is symmetric with respect to the four statements, and is comparable to the first-order model estimate of Martin and Bradley (1972), whose values are also listed in Table 1.

From the design matrix or log-linear representation in Fig. 1, we obtain the parametric representation for the log-odds (low IQ/high IQ)

\[
\ln\bigl(x_e^*(1ijk\ell)/x_e^*(2ijk\ell)\bigr)
\]

over the sixteen response vectors as given in Table 3a. Thus, for example,

\[
\ln\frac{x_e^*(11111)}{x_e^*(21111)} = \tau^{h}_{1} + \tau^{hi}_{11} + \tau^{hj}_{11} + \tau^{hk}_{11} + \tau^{h\ell}_{11},
\]

that is, a linear regression of the log-odds in terms of a constant $\tau^{h}_{1}$ and the main effects of each component of the response vector, namely, $\tau^{hi}_{11}$, $\tau^{hj}_{11}$, $\tau^{hk}_{11}$, $\tau^{h\ell}_{11}$. The numerical values of the log-odds and the parameters are easily obtained from the entries in the computer output and are also given in Table 3a. It is clear that the odds may be expressed in a multiplicative model. The odds and the odds factors are easier to appreciate. From the log-odds representation above we find

\[
\frac{x_e^*(11111)}{x_e^*(21111)} = \exp(\tau^{h}_{1})\,\exp(\tau^{hi}_{11})\,\exp(\tau^{hj}_{11})\,\exp(\tau^{hk}_{11})\,\exp(\tau^{h\ell}_{11})
\]

and from the values in Table 3a have

        1.237 = (.682)(.816)(1.132)(1.406)(1.396).
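The odds factors and the sixteen log-odds can be recomputed from the five parameter values reported for this estimate ($\tau^{h}_{1}$ = -0.3831, $\tau^{hi}_{11}$ = -0.2030, $\tau^{hj}_{11}$ = 0.1240, $\tau^{hk}_{11}$ = 0.3411, $\tau^{h\ell}_{11}$ = 0.3338). The sketch below assumes only those rounded values; the dictionary keys and function name are hypothetical, and results agree with the tabulated ones only up to rounding in the fourth decimal place.

```python
import math
from itertools import product

# Rounded parameter values for the estimate fitting x(.ijkl), x(hi...),
# x(h.j..), x(h..k.), x(h...l); the keys are labels chosen for this sketch.
TAU = {"h": -0.3831, "hi": -0.2030, "hj": 0.1240, "hk": 0.3411, "hl": 0.3338}


def log_odds(i, j, k, l, tau=TAU):
    """Log-odds (low IQ / high IQ) for response vector (i, j, k, l)."""
    value = tau["h"]
    for index, key in ((i, "hi"), (j, "hj"), (k, "hk"), (l, "hl")):
        if index == 1:                 # taus with an index equal to 2 are zero
            value += tau[key]
    return value


factors = [math.exp(TAU[key]) for key in ("h", "hi", "hj", "hk", "hl")]
odds_1111 = math.exp(log_odds(1, 1, 1, 1))
table_3a = {cell: log_odds(*cell) for cell in product((1, 2), repeat=4)}
```

The product of the five factors gives odds of about 1.237 for the response vector 1111, and the sign of each entry of `table_3a` indicates on which side of even odds the response vector falls.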


We note from Table 3a that

\[
\ln\frac{x_e^*(1ijk1)}{x_e^*(2ijk1)} - \ln\frac{x_e^*(1ijk2)}{x_e^*(2ijk2)} = \tau^{h\ell}_{11} = 0.3338,
\]

that is, a change from disagree to agree on the fourth statement is associated with an increase of 0.3338 in the log-odds (low IQ/high IQ). Note also that $\tau^{h\ell}_{11}$ represents the association between IQ and response to the fourth statement as measured by the log-cross-product-ratio (log relative odds)

\[
\tau^{h\ell}_{11} = \ln\frac{x_e^*(1ijk1)\,x_e^*(2ijk2)}{x_e^*(2ijk1)\,x_e^*(1ijk2)},
\]

and is the same for all eight levels of the responses to statements one, two and three.

Similarly, it is found that

\[
\begin{aligned}
\ln\frac{x_e^*(1ij1\ell)}{x_e^*(2ij1\ell)} - \ln\frac{x_e^*(1ij2\ell)}{x_e^*(2ij2\ell)} &= \tau^{hk}_{11} = 0.3411,\\
\ln\frac{x_e^*(1i1k\ell)}{x_e^*(2i1k\ell)} - \ln\frac{x_e^*(1i2k\ell)}{x_e^*(2i2k\ell)} &= \tau^{hj}_{11} = 0.1240,\\
\ln\frac{x_e^*(11jk\ell)}{x_e^*(21jk\ell)} - \ln\frac{x_e^*(12jk\ell)}{x_e^*(22jk\ell)} &= \tau^{hi}_{11} = -0.2030.
\end{aligned}
\]
Classification

Since $x(1\cdot\cdot\cdot\cdot) = x_e^*(1\cdot\cdot\cdot\cdot) = 1491$, and $x(2\cdot\cdot\cdot\cdot) = x_e^*(2\cdot\cdot\cdot\cdot) = 1491$, we assign a response vector (ijkℓ) to the region

$E_1$: classify as population h=1 (low IQ), when

\[
\ln\frac{x_e^*(1ijk\ell)}{x_e^*(2ijk\ell)} \ge 0,
\]

and to the complementary region

$E_2$: classify as population h=2 (high IQ), when

\[
\ln\frac{x_e^*(1ijk\ell)}{x_e^*(2ijk\ell)} < 0.
\]

If we set, for each of the regions $E = E_1, E_2$,

\[
\mu_1(E) = \sum_{(ijk\ell)\in E} \frac{x_e^*(1ijk\ell)}{1491},
\qquad
\mu_2(E) = \sum_{(ijk\ell)\in E} \frac{x_e^*(2ijk\ell)}{1491},
\]

then the probability of error of the classification procedure is (Kullback, 1959, pp. 4, 69, 30),

\[
\text{Prob Error} = p\,\mu_2(E_1) + q\,\mu_1(E_2) = \bigl(\mu_2(E_1) + \mu_1(E_2)\bigr)/2,
\]

since here $p = x(2\cdot\cdot\cdot\cdot)/2982 = \tfrac12$, $q = x(1\cdot\cdot\cdot\cdot)/2982 = \tfrac12$.

The relevant computations with $x_e^*(hijk\ell)$ are given in Table 4(b) and show that the Prob. Error = 0.444. The corresponding computations with the original data x(hijkℓ) are given in Table 4(a) and yield Prob. Error = 0.441.


Other  Estimates 

In view of the measure of the effect of the marginal x(hi..ℓ) (and the associated interaction parameters) in Table 2a, $2I(x_m^*:x_g^*)$ = 4.316, 1 D.F., and the marginal x(h.j.ℓ), $2I(x_p^*:x_n^*)$ = 3.181, 1 D.F., the m.d.i. estimate $x_v^*(hijk\ell)$ fitting the marginals x(.ijkℓ), x(h.j..), x(h..k.), x(hi..ℓ) and the m.d.i. estimate $x_w^*(hijk\ell)$ fitting the marginals x(.ijkℓ), x(h..k.), x(hi..ℓ), x(h.j.ℓ) were computed. The estimates are given in Table 1 and the relevant analysis of information given in Table 2b.

The values of the log-odds, parametric representation, and the associated interaction parameters are given in Table 3b for $x_v^*(hijk\ell)$ and in Table 3c for $x_w^*(hijk\ell)$. Note from Table 3b that

\[
\begin{aligned}
\ln\frac{x_v^*(11jk1)}{x_v^*(21jk1)} - \ln\frac{x_v^*(11jk2)}{x_v^*(21jk2)} &= \tau^{h\ell}_{11} + \tau^{hi\ell}_{111} = 0.6469,\\
\ln\frac{x_v^*(12jk1)}{x_v^*(22jk1)} - \ln\frac{x_v^*(12jk2)}{x_v^*(22jk2)} &= \tau^{h\ell}_{11} = 0.2680,\\
\ln\frac{x_v^*(11jk1)}{x_v^*(21jk1)} - \ln\frac{x_v^*(12jk1)}{x_v^*(22jk1)} &= \tau^{hi}_{11} + \tau^{hi\ell}_{111} = -0.0276,\\
\ln\frac{x_v^*(11jk2)}{x_v^*(21jk2)} - \ln\frac{x_v^*(12jk2)}{x_v^*(22jk2)} &= \tau^{hi}_{11} = -0.4065,
\end{aligned}
\]

reflecting the interaction of the responses to the first and fourth statements.

From Table 3c, it is found, for example, that

\[
\begin{aligned}
\ln\frac{x_w^*(111k1)}{x_w^*(211k1)} - \ln\frac{x_w^*(111k2)}{x_w^*(211k2)} &= \tau^{h\ell}_{11} + \tau^{hi\ell}_{111} + \tau^{hj\ell}_{111} = 0.5806,\\
\ln\frac{x_w^*(121k1)}{x_w^*(221k1)} - \ln\frac{x_w^*(121k2)}{x_w^*(221k2)} &= \tau^{h\ell}_{11} + \tau^{hj\ell}_{111} = 0.2030,\\
\ln\frac{x_w^*(112k1)}{x_w^*(212k1)} - \ln\frac{x_w^*(112k2)}{x_w^*(212k2)} &= \tau^{h\ell}_{11} + \tau^{hi\ell}_{111} = 0.9371,\\
\ln\frac{x_w^*(122k1)}{x_w^*(222k1)} - \ln\frac{x_w^*(122k2)}{x_w^*(222k2)} &= \tau^{h\ell}_{11} = 0.5595,
\end{aligned}
\]

reflecting the interactions of the responses to the first, second and fourth statements.

The computation of the probability of error using the estimates $x_v^*(hijk\ell)$ and $x_w^*(hijk\ell)$ is shown in Table 4c and 4d respectively, and yields probabilities of error 0.444 and 0.446.


Remark 

Martin and Bradley (1972) examined Solomon's data in terms of an estimate they called a first-order or linear model. These estimated values are given in Table 1. It turns out that although the underlying approaches are different, the Martin and Bradley parameters, their $a_i$, and estimates are practically the same as those for $x_e^*(hijk\ell)$. From Martin and Bradley (1972, pp. 216-217) we note that

\[
\begin{aligned}
\ln\frac{x_e^*(12222)}{x_e^*(22222)} &= \tau^{h}_{1} \approx \ln\frac{1+a_0+a_1+a_2+a_3+a_4}{1-a_0-a_1-a_2-a_3-a_4},\\
\ln\frac{x_e^*(12221)}{x_e^*(22221)} &= \tau^{h}_{1} + \tau^{h\ell}_{11} \approx \ln\frac{1+a_0+a_1+a_2+a_3-a_4}{1-a_0-a_1-a_2-a_3+a_4},\\
\ln\frac{x_e^*(12212)}{x_e^*(22212)} &= \tau^{h}_{1} + \tau^{hk}_{11} \approx \ln\frac{1+a_0+a_1+a_2-a_3+a_4}{1-a_0-a_1-a_2+a_3-a_4},\\
\ln\frac{x_e^*(12122)}{x_e^*(22122)} &= \tau^{h}_{1} + \tau^{hj}_{11} \approx \ln\frac{1+a_0+a_1-a_2+a_3+a_4}{1-a_0-a_1+a_2-a_3-a_4},\\
\ln\frac{x_e^*(11222)}{x_e^*(21222)} &= \tau^{h}_{1} + \tau^{hi}_{11} \approx \ln\frac{1+a_0-a_1+a_2+a_3+a_4}{1-a_0+a_1-a_2-a_3-a_4},
\end{aligned}
\]

or to a first approximation of the logarithm

\[
\begin{aligned}
\tau^{h}_{1} &\approx 2a_0+2a_1+2a_2+2a_3+2a_4,\\
\tau^{h}_{1} + \tau^{h\ell}_{11} &\approx 2a_0+2a_1+2a_2+2a_3-2a_4,\\
\tau^{h}_{1} + \tau^{hk}_{11} &\approx 2a_0+2a_1+2a_2-2a_3+2a_4,\\
\tau^{h}_{1} + \tau^{hj}_{11} &\approx 2a_0+2a_1-2a_2+2a_3+2a_4,\\
\tau^{h}_{1} + \tau^{hi}_{11} &\approx 2a_0-2a_1+2a_2+2a_3+2a_4.
\end{aligned}
\]

It is found that

\[
\tau^{h\ell}_{11} \approx -4a_4,\qquad
\tau^{hk}_{11} \approx -4a_3,\qquad
\tau^{hj}_{11} \approx -4a_2,\qquad
\tau^{hi}_{11} \approx -4a_1.
\]

The values of the parameters given by Martin and Bradley (1972, Table 3, p. 217) are

\[
a_0 = -0.042,\quad a_1 = 0.049,\quad a_2 = -0.031,\quad a_3 = -0.084,\quad a_4 = -0.082,
\]

so that

        τ(hℓ) =  0.3338 ≈  0.334,    -4a_4 =  0.328,
        τ(hk) =  0.3411 ≈  0.341,    -4a_3 =  0.336,
        τ(hj) =  0.1240 ≈  0.124,    -4a_2 =  0.124,
        τ(hi) = -0.2030 ≈ -0.203,    -4a_1 = -0.196.

The computation for the probability of error using the Martin and Bradley estimates is shown in Table 4e and yields a probability of error 0.445. (Martin and Bradley give a value of the risk as 0.455.)
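The comparison between the Martin and Bradley parameters and the interaction parameters rests on the first-order approximation ln((1+s)/(1-s)) ≈ 2s for small s. The sketch below evaluates both sides using the five parameter values quoted from Martin and Bradley; the sign patterns, function name and variable names are assumptions introduced for the illustration.

```python
import math

A = [-0.042, 0.049, -0.031, -0.084, -0.082]     # a0, a1, a2, a3, a4


def log_odds_first_order(signs):
    """Return (exact, approximate) log-odds for a sign pattern on a0..a4."""
    s = sum(sign * a for sign, a in zip(signs, A))
    return math.log((1.0 + s) / (1.0 - s)), 2.0 * s


exact_h, approx_h = log_odds_first_order([1, 1, 1, 1, 1])      # all signs +
exact_hl, approx_hl = log_odds_first_order([1, 1, 1, 1, -1])   # a4 reversed

# Differences of first-order values give -4a1, ..., -4a4.
minus_four_a = [-4.0 * a for a in A[1:]]
```

The first-order value `approx_h` is -0.380 and the exact logarithm is about -0.385, both close to the constant term -0.3831 of the m.d.i. estimate; the entries of `minus_four_a` are the values -0.196, 0.124, 0.336, 0.328 compared with the interaction parameters in the text.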
Solomon's Data - Classification Procedures

Figure 1

Table 2a

Analysis of Information

Marginals Fitted

a)  x(.ijkℓ), x(h....)
b)  x(.ijkℓ), x(hi...)
c)  x(.ijkℓ), x(hi...), x(h.j..)
e)  x(.ijkℓ), x(hi...), x(h.j..), x(h..k.), x(h...ℓ)
f)  x(.ijkℓ), x(h..k.), x(h...ℓ), x(hij..)
g)  x(.ijkℓ), x(h...ℓ), x(hij..), x(hi.k.)
m)  x(.ijkℓ), x(hij..), x(hi.k.), x(hi..ℓ)
n)  x(.ijkℓ), x(hij..), x(hi.k.), x(hi..ℓ), x(h.jk.)
p)  x(.ijkℓ), x(hij..), x(hi.k.), x(hi..ℓ), x(h.jk.), x(h.j.ℓ)
q)  x(.ijkℓ), x(hij..), x(hi.k.), x(hi..ℓ), x(h.jk.), x(h.j.ℓ), x(h..kℓ)
r)  x(.ijkℓ), x(hi..ℓ), x(h.j.ℓ), x(h..kℓ), x(hijk.)

Analysis of Information (continued)

Marginals Fitted                                          Information               D.F.

r)  (as above)                                            2I(x:x*_r)    = 4.204      4
s)  x(.ijkℓ), x(h..kℓ), x(hijk.), x(hij.ℓ)                2I(x*_s:x*_r) = 2.303      1
                                                          2I(x:x*_s)    = 1.901      3
t)  x(.ijkℓ), x(hijk.), x(hij.ℓ), x(hi.kℓ)                2I(x*_t:x*_s) = 1.375      1
                                                          2I(x:x*_t)    = 0.526      2
u)  x(.ijkℓ), x(hijk.), x(hij.ℓ), x(hi.kℓ), x(h.jkℓ)      2I(x*_u:x*_t) = 0.361      1
                                                          2I(x:x*_u)    = 0.165      1

Table 2b

Analysis of Information

Marginals Fitted                                          Information                D.F.

e)  x(.ijkℓ), x(hi...), x(h.j..), x(h..k.), x(h...ℓ)      2I(x:x*_e)    = 16.307     11
v)  x(.ijkℓ), x(h.j..), x(h..k.), x(hi..ℓ)                2I(x*_v:x*_e) =  3.735      1
                                                          2I(x:x*_v)    = 12.572     10
w)  x(.ijkℓ), x(h..k.), x(hi..ℓ), x(h.j.ℓ)                2I(x*_w:x*_v) =  3.443      1
                                                          2I(x:x*_w)    =  9.129      9


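Log-odds ln x*_e(1ijkℓ)/x*_e(2ijkℓ)

(Subscripts omitted: τ(h) = $\tau^{h}_{1}$, τ(hi) = $\tau^{hi}_{11}$, and so on. The log-odds entry for 1111 is not legible in the source.)

        ijkℓ    Parametric representation                  log-odds

        1111    τ(h) + τ(hi) + τ(hj) + τ(hk) + τ(hℓ)
        1112    τ(h) + τ(hi) + τ(hj) + τ(hk)               -0.1210
        1121    τ(h) + τ(hi) + τ(hj) + τ(hℓ)               -0.1284
        1122    τ(h) + τ(hi) + τ(hj)                       -0.4621
        1211    τ(h) + τ(hi) + τ(hk) + τ(hℓ)                0.0888
        1212    τ(h) + τ(hi) + τ(hk)                       -0.2450
        1221    τ(h) + τ(hi) + τ(hℓ)                       -0.2524
        1222    τ(h) + τ(hi)                               -0.5861
        2111    τ(h) + τ(hj) + τ(hk) + τ(hℓ)                0.4158
        2112    τ(h) + τ(hj) + τ(hk)                        0.0820
        2121    τ(h) + τ(hj) + τ(hℓ)                        0.0746
        2122    τ(h) + τ(hj)                               -0.2592
        2211    τ(h) + τ(hk) + τ(hℓ)                        0.2918
        2212    τ(h) + τ(hk)                               -0.0420
        2221    τ(h) + τ(hℓ)                               -0.0494
        2222    τ(h)                                       -0.3831

        τ(h) = -0.3831,  τ(hi) = -0.2030,  τ(hj) = 0.1240,
        τ(hk) = 0.3411,  τ(hℓ) = 0.3338

Table 3a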


Log-odds ln x*_v(1ijkℓ)/x*_v(2ijkℓ)

(Subscripts omitted: τ(h) = $\tau^{h}_{1}$, τ(hi) = $\tau^{hi}_{11}$, τ(hiℓ) = $\tau^{hi\ell}_{111}$, and so on.)

        ijkℓ    Parametric representation                           log-odds

        1111    τ(h) + τ(hi) + τ(hj) + τ(hk) + τ(hℓ) + τ(hiℓ)        0.3571
        1112    τ(h) + τ(hi) + τ(hj) + τ(hk)                        -0.2898
        1121    τ(h) + τ(hi) + τ(hj) + τ(hℓ) + τ(hiℓ)                0.0115
        1122    τ(h) + τ(hi) + τ(hj)                                -0.6355
        1211    τ(h) + τ(hi) + τ(hk) + τ(hℓ) + τ(hiℓ)                0.2366
        1212    τ(h) + τ(hi) + τ(hk)                                -0.4101
        1221    τ(h) + τ(hi) + τ(hℓ) + τ(hiℓ)                       -0.1088
        1222    τ(h) + τ(hi)                                        -0.7557
        2111    τ(h) + τ(hj) + τ(hk) + τ(hℓ)                         0.3847
        2112    τ(h) + τ(hj) + τ(hk)                                 0.1167
        2121    τ(h) + τ(hj) + τ(hℓ)                                 0.0390
        2122    τ(h) + τ(hj)                                        -0.2290
        2211    τ(h) + τ(hk) + τ(hℓ)                                 0.2644
        2212    τ(h) + τ(hk)                                        -0.0036
        2221    τ(h) + τ(hℓ)                                        -0.0813
        2222    τ(h)                                                -0.3492

        τ(h) = -0.3492,  τ(hi) = -0.4065,  τ(hj) = 0.1203,
        τ(hk) = 0.3457,  τ(hℓ) = 0.2680,  τ(hiℓ) = 0.3789

Table 3b
Martin and Bradley

        E_1 (ijkℓ)     x(1ijkℓ)     x(2ijkℓ)

        1111             74.67        60.33
        1211             12.02        10.98
        2111            314.50       207.50
        2112            193.45       178.55
        2121            259.17       240.83
        2211             37.74        28.26

        Total           891.55       726.45

\[
\mu_2(E_1) = \frac{726.45}{1491},
\qquad
\mu_1(E_2) = \frac{1491 - 891.55}{1491},
\]

\[
\text{Prob. Error} = \frac12\cdot\frac{726.45 + 599.45}{1491} = \frac{1325.90}{2982} = 0.445.
\]

Table 4(e)
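The arithmetic of Table 4(e) can be reproduced directly from the six response vectors assigned to the region E_1 and the group size 1491. The sketch below is illustrative; the dictionary name, function name and argument names are hypothetical, and equal prior probabilities p = q = 1/2 are used as in the text.

```python
# Estimated frequencies (low IQ, high IQ) for the response vectors in E1,
# as listed in Table 4(e).
E1_CELLS = {
    "1111": (74.67, 60.33),
    "1211": (12.02, 10.98),
    "2111": (314.50, 207.50),
    "2112": (193.45, 178.55),
    "2121": (259.17, 240.83),
    "2211": (37.74, 28.26),
}


def probability_of_error(e1_cells, group_total=1491.0, p=0.5, q=0.5):
    """p * mu2(E1) + q * mu1(E2), with E2 the complement of E1."""
    low_in_e1 = sum(low for low, _ in e1_cells.values())
    high_in_e1 = sum(high for _, high in e1_cells.values())
    mu2_e1 = high_in_e1 / group_total                  # high IQ classified low
    mu1_e2 = (group_total - low_in_e1) / group_total   # low IQ classified high
    return p * mu2_e1 + q * mu1_e2


prob_error = probability_of_error(E1_CELLS)
```

Rounded to three decimals, `prob_error` is 0.445, the value reported in the table.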


Example 2.  Leukemia death observation at ABCC. This example illustrates the analysis of a three-way 5x6x2 contingency table. It illustrates the estimation procedure for the hypothesis of no second-order interaction. It also illustrates the use of a cell, other than the last one, as the reference cell. Details of the computation of the covariance matrix of a set of estimated parameters of interest are given. Confidence intervals for the parameters are computed using the multiple comparison lemma.
The Analysis of Leukemia
Death Observation at ABCC


Sugiura and Otake (1974) have considered the analysis
of k 2 x c contingency tables and have applied their procedures
to the data in Table I. We propose to apply
the minimum discrimination information estimation and
associated concepts to the analysis of the data in Table
I. We denote the occurrences in the three-way contingency
Table I by x(ijk) with the notation


Variable     Index   1             2       3       4       5         6

Age          i       0-9           10-19   20-34   35-49   50+
Dose         j       Not in city   0-9     10-49   50-99   100-199   200+
Mortality    k       Dead          Alive

We get the minimum discrimination information
estimates fitting the sets of marginals

a) x(ij.), x(..k)

b) x(ij.), x(i.k)

c) x(ij.), x(i.k), x(.jk)

d) x(ij.), x(.jk)

We start with the set of marginals x(ij.), x(..k) because
xa*(ijk) = x(ij.)x(..k)/n is the m.d.i. or
maximum-likelihood estimate under the null hypothesis that
mortality is homogeneous over the age by dose combinations.
We summarize the results in the Analysis of Information
Table.
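The homogeneity estimate x(ij.)x(..k)/n and its information statistic can be illustrated on a small table. The sketch below uses hypothetical 2x2x2 counts, not the ABCC data, and assumes the statistic has the usual form 2I(x:x*) = 2 Σ x ln(x/x*); the function names are illustrative.

```python
import math


def homogeneity_estimate(x):
    """Return x*(ijk) = x(ij.) x(..k) / n for a nested list x[i][j][k]."""
    n_i, n_j, n_k = len(x), len(x[0]), len(x[0][0])
    n = sum(x[i][j][k] for i in range(n_i) for j in range(n_j) for k in range(n_k))
    x_k = [sum(x[i][j][k] for i in range(n_i) for j in range(n_j)) for k in range(n_k)]
    return [[[sum(x[i][j]) * x_k[k] / n for k in range(n_k)]
             for j in range(n_j)] for i in range(n_i)]


def mdi_statistic(x, x_star):
    """Return 2 I(x : x*) = 2 * sum x ln(x / x*), skipping empty cells."""
    total = 0.0
    for plane, plane_star in zip(x, x_star):
        for row, row_star in zip(plane, plane_star):
            for obs, fit in zip(row, row_star):
                if obs > 0:
                    total += obs * math.log(obs / fit)
    return 2.0 * total


# Hypothetical counts: x[i][j][k] with k = 0 dead, k = 1 alive.
x = [[[10, 90], [20, 80]],
     [[5, 95], [40, 60]]]
x_a = homogeneity_estimate(x)
two_I = mdi_statistic(x, x_a)
# (number of ij combinations - 1) * (number of k categories - 1)
degrees_of_freedom = (2 * 2 - 1) * (2 - 1)
print(round(two_I, 3), degrees_of_freedom)
```

For the 5x6x2 table of this example the same degrees-of-freedom rule gives (30 - 1)(2 - 1) = 29, the value shown for component a).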



Analysis of Information Table


Component due to                 Information                 D.F.

a) x(ij.), x(..k)                2I(x:xa*)   = 205.983        29
b) x(ij.), x(i.k)                2I(xb*:xa*) =   2.326         4
                                 2I(x:xb*)   = 203.657        25
c) x(ij.), x(i.k), x(.jk)        2I(xc*:xb*) = 175.810         5
                                 2I(x:xc*)   =  27.847        20

a) x(ij.), x(..k)                2I(x:xa*)   = 205.983        29
d) x(ij.), x(.jk)                2I(xd*:xa*) = 173.502         5
                                 2I(x:xd*)   =  32.481        24
c) x(ij.), x(.jk), x(i.k)        2I(xc*:xd*) =   4.634         4
                                 2I(x:xc*)   =  27.847        20

We may draw the following inferences from the
Analysis of Information Table.

1. Mortality is not homogeneous over the age by
dose combinations (2I(x:xa*) = 205.983, 29 D.F.)

2. The effects of age by mortality are not
significant (2I(xb*:xa*) = 2.326, 4 D.F., 2I(xc*:xd*) =
4.634, 4 D.F.)

3. The effects of dose by mortality are highly
significant (2I(xc*:xb*) = 175.810, 5 D.F., 2I(xd*:xa*) =
173.502, 5 D.F.)
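The additivity of the information components and the three inferences can be checked numerically. The sketch below takes the entries of the Analysis of Information Table as given and compares the two interaction components with the upper 5% points of chi-squared for 4 and 5 degrees of freedom (9.488 and 11.070); the 5% level and the variable names are assumptions made for the illustration.

```python
# (information, degrees of freedom) as read from the Analysis of Information Table
total = (205.983, 29)                                          # 2I(x:xa*)
path_age_first = [(2.326, 4), (175.810, 5), (27.847, 20)]      # a) -> b) -> c)
path_dose_first = [(173.502, 5), (4.634, 4), (27.847, 20)]     # a) -> d) -> c)

# Upper 5% points of the chi-squared distribution.
CHI2_95 = {4: 9.488, 5: 11.070}


def adds_up(components, whole, tol=1e-3):
    """True when information values and degrees of freedom both sum to the total."""
    info = sum(c[0] for c in components)
    df = sum(c[1] for c in components)
    return abs(info - whole[0]) < tol and df == whole[1]


def significant(info, df):
    """Compare a component with the tabulated 5% point for its degrees of freedom."""
    return info > CHI2_95[df]


print(adds_up(path_age_first, total), adds_up(path_dose_first, total))
print(significant(2.326, 4), significant(175.810, 5))
```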
Since the value of 2I(x:xd*) = 32.481, 24 D.F.,
is not significant at the 10% level, we obtained the complete
output for xd* and the estimates are shown in Table IIb.
However, since four OUTLIER values were indicated for
xd*, and for comparison with the results of Sugiura and
Otake, it was decided to perform a more complete analysis
with the estimate fitting all the two-way marginals,
that is, the estimate corresponding to a hypothesis
of no second-order interaction. This estimate is given
in Table IIa and we have called it x2*(ijk), that is,
x2*(ijk) = xc*(ijk).
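An estimate that fits all the two-way marginals has no closed form and is obtained iteratively. The sketch below uses iterative proportional fitting, a standard algorithm for this task; it is an assumption that it matches the program used for the report, which is not described in this section. The counts are hypothetical and the names are illustrative.

```python
import math


def fit_two_way_marginals(x, sweeps=500):
    """Iterative proportional fitting of x(ij.), x(i.k), x(.jk) for x[i][j][k]."""
    n_i, n_j, n_k = len(x), len(x[0]), len(x[0][0])
    fit = [[[1.0] * n_k for _ in range(n_j)] for _ in range(n_i)]
    for _ in range(sweeps):
        for i in range(n_i):                      # rescale to match x(ij.)
            for j in range(n_j):
                scale = sum(x[i][j]) / sum(fit[i][j])
                for k in range(n_k):
                    fit[i][j][k] *= scale
        for i in range(n_i):                      # rescale to match x(i.k)
            for k in range(n_k):
                scale = (sum(x[i][j][k] for j in range(n_j))
                         / sum(fit[i][j][k] for j in range(n_j)))
                for j in range(n_j):
                    fit[i][j][k] *= scale
        for j in range(n_j):                      # rescale to match x(.jk)
            for k in range(n_k):
                scale = (sum(x[i][j][k] for i in range(n_i))
                         / sum(fit[i][j][k] for i in range(n_i)))
                for i in range(n_i):
                    fit[i][j][k] *= scale
    return fit


def log_odds_ratio(t, k):
    """Log odds ratio of the 2x2 (i, j) table at level k."""
    return math.log(t[0][0][k] * t[1][1][k] / (t[0][1][k] * t[1][0][k]))


# Hypothetical 2x2x2 counts with all cells positive.
x = [[[10, 90], [20, 80]],
     [[5, 95], [40, 60]]]
x2 = fit_two_way_marginals(x)
print(log_odds_ratio(x2, 0), log_odds_ratio(x2, 1))
```

Starting from a constant table keeps every iterate inside the no second-order interaction model, so the (i, j) log odds ratio of the fitted table is the same at both levels of k.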

Again for easier comparison with the results of
Sugiura and Otake we selected the cell (512) as the
reference cell so that the log-linear representation
of x2*(ijk) is given by

$$\ln \frac{x_2^*(ijk)}{n} = L + \tau^{i}_{1}T^{i}_{1}(ijk) + \cdots + \tau^{i}_{4}T^{i}_{4}(ijk) + \tau^{j}_{2}T^{j}_{2}(ijk) + \cdots + \tau^{j}_{6}T^{j}_{6}(ijk)$$
$$\qquad + \tau^{k}_{1}T^{k}_{1}(ijk) + \tau^{ij}_{12}T^{ij}_{12}(ijk) + \cdots + \tau^{ij}_{46}T^{ij}_{46}(ijk) + \tau^{ik}_{11}T^{ik}_{11}(ijk) + \cdots + \tau^{ik}_{41}T^{ik}_{41}(ijk)$$
$$\qquad + \tau^{jk}_{21}T^{jk}_{21}(ijk) + \cdots + \tau^{jk}_{61}T^{jk}_{61}(ijk)$$

where L is a normalizing constant; the taus are main effect and interaction
parameters and the T(ijk) are the explanatory variables,
the indicator functions of the corresponding marginals,
e.g.

$$\sum_{ijk} T^{ij}_{12}(ijk)\, x_2^*(ijk) = x_2^*(12\cdot) = x(12\cdot), \text{ etc.}$$


From the log-linear representation of x2*(ijk) we have
the log-linear representation of the mortality log-odds
or logit as

$$\ln \frac{x_2^*(ij1)}{x_2^*(ij2)} = \tau^{k}_{1} + \tau^{ik}_{i1} + \tau^{jk}_{j1}$$

where $\tau^{ik}_{51} = 0 = \tau^{jk}_{11}$.


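Since the computer output includes the logarithms of the
x2*(ijk) we can evaluate the tau parameters, for example,
as follows

$$\ln \frac{x_2^*(511)}{x_2^*(512)} = \tau^{k}_{1}$$

$$\ln \frac{x_2^*(111)}{x_2^*(112)} = \tau^{k}_{1} + \tau^{ik}_{11}$$

$$\ln \frac{x_2^*(411)}{x_2^*(412)} = \tau^{k}_{1} + \tau^{ik}_{41}$$

$$\ln \frac{x_2^*(521)}{x_2^*(522)} = \tau^{k}_{1} + \tau^{jk}_{21}$$

$$\ln \frac{x_2^*(561)}{x_2^*(562)} = \tau^{k}_{1} + \tau^{jk}_{61}$$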
The following are the values obtained

\tau^{k}_{1}   = -7.4714        \tau^{jk}_{21} = 0.5017
\tau^{ik}_{11} = -0.0849        \tau^{jk}_{31} = 0.9685
\tau^{ik}_{21} = -0.4515        \tau^{jk}_{41} = 1.2848
\tau^{ik}_{31} = -0.2655        \tau^{jk}_{51} = 2.2293
\tau^{ik}_{41} =  0.0371        \tau^{jk}_{61} = 3.4785

Sugiura and Otake used the representation for
the log-odds

$$\log\{p_{ij}/(1-p_{ij})\} = \mu + \alpha_i + \beta_j$$

where $\sum_{i=1}^{5} \alpha_i = 0$, $\beta_1 = 0$, and give the estimates

\alpha_1 =  0.068        \beta_2 = 0.502
\alpha_2 = -0.299        \beta_3 = 0.969
\alpha_3 = -0.113        \beta_4 = 1.285
\alpha_4 =  0.190        \beta_5 = 2.229
\alpha_5 =  0.153        \beta_6 = 3.478

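We note that $\tau^{jk}_{21} = \beta_2, \ldots, \tau^{jk}_{61} = \beta_6$ and

$$\mu + \alpha_5 = \tau^{k}_{1}$$
$$\mu + \alpha_1 = \tau^{k}_{1} + \tau^{ik}_{11}$$
$$\mu + \alpha_2 = \tau^{k}_{1} + \tau^{ik}_{21}$$
$$\mu + \alpha_3 = \tau^{k}_{1} + \tau^{ik}_{31}$$
$$\mu + \alpha_4 = \tau^{k}_{1} + \tau^{ik}_{41}$$

that is

$$\alpha_1 = \tau^{ik}_{11} - (\tau^{ik}_{11} + \tau^{ik}_{21} + \tau^{ik}_{31} + \tau^{ik}_{41})/5$$
$$\alpha_2 = \tau^{ik}_{21} - (\tau^{ik}_{11} + \tau^{ik}_{21} + \tau^{ik}_{31} + \tau^{ik}_{41})/5$$
$$\alpha_3 = \tau^{ik}_{31} - (\tau^{ik}_{11} + \tau^{ik}_{21} + \tau^{ik}_{31} + \tau^{ik}_{41})/5$$
$$\alpha_4 = \tau^{ik}_{41} - (\tau^{ik}_{11} + \tau^{ik}_{21} + \tau^{ik}_{31} + \tau^{ik}_{41})/5$$
$$\alpha_5 = -(\tau^{ik}_{11} + \tau^{ik}_{21} + \tau^{ik}_{31} + \tau^{ik}_{41})/5$$

yielding $\alpha_1 = 0.0680$, $\alpha_2 = -0.2986$, $\alpha_3 = -0.1126$,
$\alpha_4 = 0.1900$, $\alpha_5 = 0.1529$.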
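The conversion from the reference-cell parameters to the sum-to-zero parameters of Sugiura and Otake can be retraced in a few lines. The sketch below is illustrative only; it takes the age by mortality parameter for i = 4 as 0.0371, the value consistent with the quoted results for the fourth and fifth age groups, and sets the parameter of the reference age group i = 5 to zero. Variable names are not from the report.

```python
# Reference-cell parameters tau^{ik}_{i1}; the reference group i = 5 is zero by convention.
tau_ik = {1: -0.0849, 2: -0.4515, 3: -0.2655, 4: 0.0371, 5: 0.0}
tau_k1 = -7.4714

# Centering: alpha_i = tau^{ik}_{i1} - (sum of the tau^{ik}) / 5, so that the alphas sum to zero.
mean_tau = sum(tau_ik.values()) / len(tau_ik)
alpha = {i: t - mean_tau for i, t in tau_ik.items()}
mu = tau_k1 + mean_tau   # general level in the sum-to-zero parametrization

for i in sorted(alpha):
    print(i, round(alpha[i], 4))
```

The results agree with the estimates quoted from Sugiura and Otake to about three decimals.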

We determine the covariance matrix of the tau's
in the logit representation as follows.

Let T denote the 60x40 matrix whose columns are

$$1,\ T^{i}_{1}(ijk), \ldots, T^{i}_{4}(ijk),\ T^{j}_{2}(ijk), \ldots, T^{j}_{6}(ijk),\ T^{ij}_{12}(ijk), \ldots, T^{ij}_{46}(ijk),$$
$$T^{k}_{1}(ijk),\ T^{ik}_{11}(ijk), \ldots, T^{ik}_{41}(ijk),\ T^{jk}_{21}(ijk), \ldots, T^{jk}_{61}(ijk)$$

and let D denote a 60x60 diagonal matrix whose diagonal
values are x2*(ijk) (in the same ijk sequence as the
T(ijk) functions).

Compute the 40x40 matrix S = T'DT, partitioned as

$$S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}$$

where $S_{11}$ is 30x30 and $S_{22}$ is 10x10.

The covariance matrix of $\tau^{k}_{1}, \tau^{ik}_{11}, \ldots, \tau^{ik}_{41}, \tau^{jk}_{21}, \ldots, \tau^{jk}_{61}$
is then given by

$$S_{22.1}^{-1} = (S_{22} - S_{21} S_{11}^{-1} S_{12})^{-1}$$

The covariance matrix thus obtained is given
in Table III.
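The partitioned computation can be tried on a case small enough to check by hand. The sketch below applies the same steps to a hypothetical 2x2 table (not the 5x6x2 table of this example), where the resulting variance of the interaction parameter should equal the familiar sum of reciprocal cell counts. The counts, helper functions and names are all assumptions made for the illustration.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(a[r][t] * b[t][c] for t in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]


def inverse(m):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def block(m, rows, cols):
    return [[m[r][c] for c in cols] for r in rows]


# Hypothetical fitted values x*(11), x*(12), x*(21), x*(22); reference cell (22).
counts = [10.0, 20.0, 30.0, 40.0]
# Rows are cells; columns are 1, T^j_1, T^k_1, T^{jk}_{11}.
T = [[1, 1, 1, 1],
     [1, 1, 0, 0],
     [1, 0, 1, 0],
     [1, 0, 0, 0]]
DT = [[counts[r] * v for v in T[r]] for r in range(len(T))]
S = matmul(transpose(T), DT)                       # S = T'DT

first, second = [0, 1], [2, 3]                     # nuisance block, parameters of interest
S11, S12 = block(S, first, first), block(S, first, second)
S21, S22 = block(S, second, first), block(S, second, second)
correction = matmul(matmul(S21, inverse(S11)), S12)
schur = [[S22[r][c] - correction[r][c] for c in range(2)] for r in range(2)]
cov = inverse(schur)                               # covariance of tau^k_1, tau^{jk}_{11}
print(cov)
```

Here the entry for the interaction parameter comes out as 1/10 + 1/20 + 1/30 + 1/40, which is a convenient check that the blocks were partitioned in the right order.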

To compute confidence intervals for the $\tau^{jk}$'s,
following the procedure suggested by Sugiura and Otake
using the multiple comparison lemma, Ferguson (1967, p. 282),
we computed $\sqrt{11.07\,\mathrm{Var}(\tau^{jk})}$ using the variances in
Table III and obtained the following confidence intervals

\tau^{jk}_{21}      -0.5463      1.5497
\tau^{jk}_{31}      -0.2295      2.1665
\tau^{jk}_{41}      -0.2762      2.8458
\tau^{jk}_{51}       0.9233      3.5353
\tau^{jk}_{61}       2.4185      4.5385

The confidence intervals for the $\tau^{ik}$'s were obtained by
computing $\sqrt{9.488\,\mathrm{Var}(\tau^{ik})}$ using the
variances in Table III, leading to

\tau^{ik}_{11}      -0.9689      0.7991
\tau^{ik}_{21}      -0.4515      0.4455
\tau^{ik}_{31}      -1.1525      0.6215
\tau^{ik}_{41}      -0.7759      0.8501
To relate with the bounds given by Sugiura and Otake for
the $\alpha$'s, since we have seen that

$$\alpha_5 = -(\tau^{ik}_{11} + \tau^{ik}_{21} + \tau^{ik}_{31} + \tau^{ik}_{41})/5$$

we have that

$$\mathrm{Var}(\alpha_5) = \frac{1}{25}\Big\{\mathrm{Var}(\tau^{ik}_{11}) + \cdots + \mathrm{Var}(\tau^{ik}_{41}) + 2\sum_{m<n} \mathrm{cov}(\tau^{ik}_{m1}, \tau^{ik}_{n1})\Big\}$$

and from the entries in Table III we finally find
$\mathrm{Var}(\alpha_5) = 0.0339$, leading to the interval
$\alpha_5$: (-0.4141, 0.7199).

We did not trouble to compute the others as it is evident
that the results are the same.
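The interval for the fifth age parameter follows from its estimate, its variance and the chi-squared constant used for the age by mortality parameters. A minimal sketch of the arithmetic, with an illustrative function name:

```python
import math


def simultaneous_interval(estimate, variance, chi2_point):
    """Return estimate -/+ sqrt(chi2_point * variance)."""
    half_width = math.sqrt(chi2_point * variance)
    return estimate - half_width, estimate + half_width


# alpha_5 = 0.1529, Var(alpha_5) = 0.0339, 9.488 = upper 5% point of chi-squared, 4 D.F.
lower, upper = simultaneous_interval(0.1529, 0.0339, 9.488)
print(round(lower, 4), round(upper, 4))
```

The result agrees with the quoted interval to about three decimals; the small difference comes from the rounding of the variance.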


In the output corresponding to fitting all the two-way
marginals, the entry corresponding to the cell x(111)
had a large OUTLIER value (5.239). Accordingly we computed
an estimate fitting all the two-way marginals but omitting
the values x(111), x(112). This estimate is denoted by
xe*(ijk) and its values are given in Table IIc.

The associated Analysis of Information is

Analysis of Information

Component due to                         Information                D.F.

x(ij.), x(i.k), x(.jk)                   2I(x:x2*)   = 27.847        20
as above but omitting x(111), x(112)     2I(xe*:x2*) =  6.223         1
                                         2I(x:xe*)   = 21.614        19
Removing x(111), x(112) from the estimation gives
an improved fit. We did not carry out any extensive
analysis with xe*(ijk) but did note the approximate
equality of

$$\tau^{jk}_{61} - \tau^{jk}_{51},\ \ldots,\ \tau^{jk}_{31} - \tau^{jk}_{21}$$

when computed for xe* and x2*, the respective values
being

                                       xe*        x2*

\tau^{jk}_{61} - \tau^{jk}_{51}        1.249      1.249
\tau^{jk}_{51} - \tau^{jk}_{41}        0.949      0.944
\tau^{jk}_{41} - \tau^{jk}_{31}        0.320      0.316
\tau^{jk}_{31} - \tau^{jk}_{21}        0.466      0.467
TABLE IIb

Fitting marginals x(ij.), x(.jk)


Example 3. Automobile accident data. This example illustrates the
analysis of a four-way 3x4x3x2 contingency table. It
points out that the model fitted determines the form of
the log-odds or logit representation, but the converse is
not true. The covariance matrix of the estimated parameters
is given.


Example

Automobile Accident Data - Driver Ejection

Data used in this example are taken from a study of the relationship
between car size and accident injuries as given in Kihlberg et al. (1964).
The observed data are given in Table 1 and the observed occurrences are
denoted by x(ijkℓ) where


Characteristic     Index   1                        2                       3                            4

Car weight         i       Small                    Compact                 Standard
Accident type      j       Collision with vehicle   Collision with object   Rollover without collision   Other rollover
Severity           k       Not severe               Mod. severe             Severe
Driver ejection    ℓ       Not ejected              Ejected

A  condensed  2x2x2x2  version  of  this  data  was  studied  by  Bhapkar  and  Koch 
(1963)  and  Ku  et  al.  (1^63). 

Since the question of interest is the possible relation of driver
ejection to car weight, accident type and severity, we start the fitting
sequence with the marginals x(ijk.), x(...ℓ). This first estimate,
xa*(ijkℓ) = x(ijk.)x(...ℓ)/n, corresponds to a null hypothesis that driver
ejection is homogeneous over the 36 combinations of the other characteristics.
As may be seen from the analysis of information table this
hypothesis is clearly rejected by the data. It is found that on fitting the
model incorporating, in addition to x(ijk.), the marginals x(i..ℓ), x(.j.ℓ),
x(..kℓ), that is, the interactions of car weight, accident type, and
severity respectively with driver ejection, a satisfactory fit to the
observed data is obtained. The models fitting in addition three-way
marginals x(ij.ℓ), etc., showed no significant effects for the associated
interaction parameters. The results are summarized in the analysis of
information table.


Analysis of Information

Component due to                                  Information                 D.F.

a) x(ijk.), x(...ℓ)                               2I(x:xa*)   = 613.102        35
b) x(ijk.), x(i..ℓ), x(.j.ℓ), x(..kℓ)             2I(xb*:xa*) = 587.584         7
                                                  2I(x:xb*)   =  25.518        28
c) x(ijk.), x(ij.ℓ), x(i.kℓ), x(.jkℓ)             2I(xc*:xb*) =  14.491        16
                                                  2I(x:xc*)   =  11.028        12

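The fitted values xb*(ijkℓ) are given in Table 2. The log-linear
regression representation of xb*(ijkℓ) contains the parameters L (a
normalizing constant),
$\tau^{i}_{1}, \tau^{i}_{2}, \tau^{j}_{1}, \tau^{j}_{2}, \tau^{j}_{3}, \tau^{k}_{1}, \tau^{k}_{2}, \tau^{\ell}_{1},$
$\tau^{ij}_{11}, \tau^{ij}_{12}, \tau^{ij}_{13}, \tau^{ij}_{21}, \tau^{ij}_{22}, \tau^{ij}_{23},$
$\tau^{ik}_{11}, \tau^{ik}_{12}, \tau^{ik}_{21}, \tau^{ik}_{22}, \tau^{i\ell}_{11}, \tau^{i\ell}_{21},$
$\tau^{jk}_{11}, \tau^{jk}_{12}, \tau^{jk}_{21}, \tau^{jk}_{22}, \tau^{jk}_{31}, \tau^{jk}_{32},$
$\tau^{j\ell}_{11}, \tau^{j\ell}_{21}, \tau^{j\ell}_{31}, \tau^{k\ell}_{11}, \tau^{k\ell}_{21},$
$\tau^{ijk}_{111}, \tau^{ijk}_{112}, \tau^{ijk}_{121}, \tau^{ijk}_{122}, \tau^{ijk}_{131}, \tau^{ijk}_{132},$
$\tau^{ijk}_{211}, \tau^{ijk}_{212}, \tau^{ijk}_{221}, \tau^{ijk}_{222}, \tau^{ijk}_{231}, \tau^{ijk}_{232}$.
The additional parameters which would appear
in the complete model for x(ijkℓ) are hypothesized as zero and represent
the 28 degrees of freedom of 2I(x:xb*). The log-odds or logit representation
for the estimate xb* is

$$\ln \frac{x_b^*(ijk1)}{x_b^*(ijk2)} = \tau^{\ell}_{1} + \tau^{i\ell}_{i1} + \tau^{j\ell}_{j1} + \tau^{k\ell}_{k1}$$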


Parameters not involving ℓ are common to numerator and denominator of the
odds and drop out. The values of the parameters may be obtained as

$$\tau^{\ell}_{1} = \ln \frac{x_b^*(3431)}{x_b^*(3432)}$$

$$\tau^{i\ell}_{11} = \ln \frac{x_b^*(1431)}{x_b^*(1432)} - \tau^{\ell}_{1}$$

$$\tau^{i\ell}_{21} = \ln \frac{x_b^*(2431)}{x_b^*(2432)} - \tau^{\ell}_{1}$$

etc.


The values of the parameters are (in this case provided as computer
output)

\tau^{\ell}_{1}   = -0.0083        \tau^{j\ell}_{11} =  1.3665        \tau^{k\ell}_{11} = 1.6085
\tau^{i\ell}_{11} = -0.2936        \tau^{j\ell}_{21} =  1.1139        \tau^{k\ell}_{21} = 0.8823
\tau^{i\ell}_{21} = -0.0788        \tau^{j\ell}_{31} = -0.2405


We recall that any parameter with a subscript i=3 and/or j=4 and/or k=3 and/
or ℓ=2 is by convention zero.
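The way the logit parameters are read off from the fitted values can be retraced from the entries of Table 2. The sketch below is illustrative; it copies the fitted values (not ejected, ejected) of the cells that differ from the reference combination (Standard, Other rollover, Severe) in one characteristic only. Because Table 2 is rounded to three decimals, agreement with the computer output is only to about three decimals, and the dictionary and function names are assumptions.

```python
import math

# (not ejected, ejected) fitted values from Table 2, keyed by (i, j, k).
fitted = {
    (3, 4, 3): (85.645, 86.355),    # reference combination
    (1, 4, 3): (6.377, 8.623),      # Small
    (2, 4, 3): (7.174, 7.826),      # Compact
    (3, 1, 3): (190.913, 49.087),   # Collision with vehicle
    (3, 2, 3): (89.406, 29.594),    # Collision with object
    (3, 3, 3): (24.535, 31.465),    # Rollover without collision
    (3, 4, 1): (78.213, 15.787),    # Not severe
    (3, 4, 2): (179.924, 75.076),   # Mod. severe
}


def logit(cell):
    not_ejected, ejected = fitted[cell]
    return math.log(not_ejected / ejected)


tau_l = logit((3, 4, 3))
tau_il = {i: logit((i, 4, 3)) - tau_l for i in (1, 2)}
tau_jl = {j: logit((3, j, 3)) - tau_l for j in (1, 2, 3)}
tau_kl = {k: logit((3, 4, k)) - tau_l for k in (1, 2)}

print(round(tau_l, 4), tau_il, tau_jl, tau_kl)
```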

It is important to note that the estimate x*(ijkℓ) obtained by
fitting the two-way marginals x(ij..), x(i.k.), x(i..ℓ), x(.jk.), x(.j.ℓ), x(..kℓ)
would also have the log-odds or logit representation

$$\ln \frac{x^*(ijk1)}{x^*(ijk2)} = \tau^{\ell}_{1} + \tau^{i\ell}_{i1} + \tau^{j\ell}_{j1} + \tau^{k\ell}_{k1}$$

The values of the parameters would depend however on the values of the
estimate x*(ijkℓ).
The model fitted determines the form of the log-odds or logit
representation but the converse is not true.

For easier interpretation of the numerical values we use the
representation of the estimated odds as the multiplicative model

$$\frac{x_b^*(ijk1)}{x_b^*(ijk2)} = \exp(\tau^{\ell}_{1})\,\exp(\tau^{i\ell}_{i1})\,\exp(\tau^{j\ell}_{j1})\,\exp(\tau^{k\ell}_{k1})$$

The factors which determine the odds of not ejected for any combination
of the characteristics are:


Factors

Base     Car weight         Accident type                        Severity

0.99     Small      0.75    Collision with vehicle       3.92    Not severe    5.00
         Compact    0.92    Collision with object        3.05    Mod. severe   2.42
         Standard   1.00    Rollover without collision   0.79    Severe        1.00
                            Other rollover               1.00

By selecting the combination of characteristics with the largest
factors, it is seen that the best odds for Not ejected, 19.40, occur for

Standard, Collision with vehicle, Not severe.

By selecting the combination of characteristics with the smallest factors,
it is seen that the worst odds for Not ejected, 0.59, occur for

Small, Rollover without collision, Severe.

The observed odds for Not ejected from the original data are 4124/707 = 5.83.
The estimated odds for any combination of characteristics are easily
obtained from the values of xb*.
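The multiplicative reading of the odds lends itself to a short check. The sketch below multiplies the tabulated factors for every combination of characteristics and picks out the largest and smallest estimated odds of Not ejected; the dictionary names are illustrative, and the factors are the rounded values from the table of factors.

```python
from itertools import product

base = 0.99
car_weight = {"Small": 0.75, "Compact": 0.92, "Standard": 1.00}
accident_type = {
    "Collision with vehicle": 3.92,
    "Collision with object": 3.05,
    "Rollover without collision": 0.79,
    "Other rollover": 1.00,
}
severity = {"Not severe": 5.00, "Mod. severe": 2.42, "Severe": 1.00}

# Estimated odds of Not ejected for each of the 36 combinations.
odds = {
    (w, a, s): base * car_weight[w] * accident_type[a] * severity[s]
    for w, a, s in product(car_weight, accident_type, severity)
}

best = max(odds, key=odds.get)
worst = min(odds, key=odds.get)
print(best, round(odds[best], 2))
print(worst, round(odds[worst], 2))
```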
The covariance matrix of the parameters for the estimate x* is
given in Table 3.


Table 1

Accident Data - Drivers Alone - Observed

                                                 Not ejected
Accident type                Accident severity   Small    Compact   Standard

Collision with vehicle       Not severe             95       166       1279
                             Mod. severe            31        34        506
                             Severe                 11        17        186

Collision with object        Not severe             34        55        599
                             Mod. severe             8        34        241
                             Severe                  5        10         89

Rollover without collision   Not severe             23        18         65
                             Mod. severe            22        17        118
                             Severe                  5         2         23

Other rollover               Not severe              9        10         83
                             Mod. severe            23        26        177
                             Severe                  8         9         86

Table 2

Accident data - Drivers Alone - Estimate x*

                                                 Not ejected                      Ejected
Accident type                Accident severity   Small     Compact   Standard    Small     Compact   Standard

Collision with vehicle       Not severe           96.349   163.874   1278.209     6.651     9.126     65.790
                             Mod. severe          28.879    34.973    503.433     4.121     4.027     53.567
                             Severe               11.154    17.212    190.913     3.846     4.788     49.087

Collision with object        Not severe           35.817    56.919    604.917     3.183     4.031     40.082
                             Mod. severe           8.448    33.095    234.832     1.552     4.905     32.167
                             Severe                3.463     8.099     89.406     1.537     2.901     29.594

Rollover without collision   Not severe           21.572    18.000     60.475     7.428     5.000     15.525
                             Mod. severe          23.367    16.516    121.512    16.633     9.484     64.488
                             Severe                3.676     3.351     24.535     6.324     4.649     31.465

Other rollover               Not severe           11.804     9.849     78.213     3.196     2.151     15.787
                             Mod. severe          23.082    28.936    179.924    12.918    13.064     75.076
                             Severe                6.377     7.174     85.645     8.623     7.826     86.355

Total                                            273.988   397.998   3452.014    76.012    72.002    558.983
Table 3

Covariance matrix - parameters of estimate x*


\tau^{\ell}_{1}   \tau^{i\ell}_{11}   \tau^{i\ell}_{21}   \tau^{j\ell}_{11}   \tau^{j\ell}_{21}   \tau^{j\ell}_{31}   \tau^{k\ell}_{11}   \tau^{k\ell}_{21}

.0017 

.0003 

.0003 

.0005 

.0003 

.0003 

.0005 

.0003 

.0039 

-.0003 

.0000 

-.0001 

.0005 

.0001 

.0001 

.0027 

.0001 

.0000 

.0001 

.0001 

.0000 

.0000 

-.0005 

-.0004 

.0003 

.0000 

.0012 

-.0003 

.0002 

.0000 

003G 


0001 


0003 


1 


Example 4. Minnesota high school graduates of June 1938. This example
illustrates the analysis of a four-way 2x3x7x4 contingency
table. In particular the "dependent" classification is not
dichotomous as in the previous examples but has four
categories. The final model leads to log-odds representations
involving main effects and interactions.
Example

Classification of Minnesota High School
Graduates of June 1938

The data of this 2x3x7x4 contingency table represent a four-way
cross classification of the April 1939 status of 13,908 Minnesota High
School graduates of June 1938. The data were presented by Hoyt et al.
(1959). They formulated and tested various hypotheses of independence
using chi-squared statistics. The same data were also used by Kullback
et al. (1962b) to illustrate the use of the minimum discrimination
information statistics in the analysis of various hypotheses of independence
and homogeneity. Patil (1974) condensed the original data into a 4x3x7
table by summing over the sex classification and tested for no second-order
interaction in the three-way table by an asymptotic chi-squared
statistic.

We shall examine models fitting certain sets of marginals and
analyze the data on the basis of the log-linear representation of a model
that fits the data well. The original data are listed in Table 1 where we
denote the occurrences in the cells by x(hijk), with


Characteristic                  Index   1                     2                      3                    4       5   6   7

Sex                             h       Male                  Female
H.S. Rank                       i       Lowest third          Middle third           Upper third
Father's Occupational Level     j       1                     2                      3                    4       5   6   7
Post H.S. Status                k       Enrolled in college   Noncollegiate school   Employed full time   Other
The problem is to determine the relationship of post high-school
status to the other variables. Note that here the "dependent" variable
is polychotomous. We summarize in the analysis of information Table 3 the
results of fitting three models to the data, or the sets of marginals,

H_a: x(hij.), x(...k),

H_b: x(hij.), x(h..k), x(.i.k), x(..jk),

H_c: x(hij.), x(.i.k), x(h.jk).

The estimate xa*, corresponding to H_a, is to determine whether the occurrences
of post high-school status are homogeneously distributed over the
42 combinations of sex, high-school rank, and father's occupational level.
We note that xa*(hijk) = x(hij.)x(...k)/n. Since the data do not support
the null hypothesis of homogeneity we consider the estimate xb* corresponding
to H_b. This estimate will provide a log-odds or logit representation in
terms of a linear combination of the main effects of sex, high-school rank
and father's occupational level on post high-school status. Since the fit
of the estimate xb* to the data was not considered satisfactory the effects
of various interactions associated with three-way marginals were examined.
The interaction with the largest effect, for the additional degrees of
freedom, turned out to be that of sex x father's occupational level x
post high-school status, that is, associated with the marginal x(h.jk).

It was decided to analyze the data in terms of the estimate xc* corresponding
to H_c. The values of xc*(hijk) are listed in Table 2.

From the log-linear representation of the estimate xc*, we arrive
at the following representation for the log-odds

$$\ln \frac{x_c^*(hij1)}{x_c^*(hij4)} = \tau^{k}_{1} + \tau^{hk}_{h1} + \tau^{ik}_{i1} + \tau^{jk}_{j1} + \tau^{hjk}_{hj1},$$

$$\ln \frac{x_c^*(hij2)}{x_c^*(hij4)} = \tau^{k}_{2} + \tau^{hk}_{h2} + \tau^{ik}_{i2} + \tau^{jk}_{j2} + \tau^{hjk}_{hj2},$$

$$\ln \frac{x_c^*(hij3)}{x_c^*(hij4)} = \tau^{k}_{3} + \tau^{hk}_{h3} + \tau^{ik}_{i3} + \tau^{jk}_{j3} + \tau^{hjk}_{hj3}.$$

The values of the parameters in the log-odds representations are:


Parameter               k = 1       k = 2       k = 3

\tau^{k}_{k}           -1.0345     -2.2548     -1.7189
\tau^{hk}_{1k}          0.9935     -0.3523     -0.1111
\tau^{ik}_{1k}         -1.5908     -1.0060     -1.0682
\tau^{ik}_{2k}         -0.8912     -0.4542     -0.4034
\tau^{jk}_{1k}          2.2731      0.9905      0.3593
\tau^{jk}_{2k}          1.2332      0.9822      0.6872
\tau^{jk}_{3k}          0.4009      0.3932      0.6333
\tau^{jk}_{4k}          1.1259      0.8881      0.6099
\tau^{jk}_{5k}          0.6194      0.3995      0.5254
\tau^{jk}_{6k}         -0.0321     -0.1397      0.1989
\tau^{hjk}_{11k}       -0.7277     -1.3054     -0.4037
\tau^{hjk}_{12k}       -0.6340     -0.8018     -0.3643
\tau^{hjk}_{13k}       -1.0923     -0.8080     -0.9709
\tau^{hjk}_{14k}       -0.8463     -0.7581     -0.5573
\tau^{hjk}_{15k}       -0.6402     -0.8605     -0.5508
\tau^{hjk}_{16k}       -0.7587     -0.2334     -0.4397


All parameters with subscripts h=2 and/or i=3 and/or j=7 and/or
k=4 are zero by convention.

From the representation for the log-odds it is seen that the
association between high-school rank and post high-school status is
independent of the combination of sex and father's occupational level,
that is,

$$\ln \frac{x_c^*(h1j1)}{x_c^*(h1j4)} - \ln \frac{x_c^*(h2j1)}{x_c^*(h2j4)} = \ln \frac{x_c^*(h1j1)\,x_c^*(h2j4)}{x_c^*(h1j4)\,x_c^*(h2j1)} = \tau^{ik}_{11} - \tau^{ik}_{21} = -0.6996,$$

$$\ln \frac{x_c^*(h2j1)\,x_c^*(h3j4)}{x_c^*(h2j4)\,x_c^*(h3j1)} = \tau^{ik}_{21} = -0.8912,$$

$$\ln \frac{x_c^*(h1j2)\,x_c^*(h2j4)}{x_c^*(h1j4)\,x_c^*(h2j2)} = \tau^{ik}_{12} - \tau^{ik}_{22} = -0.5518,$$

$$\ln \frac{x_c^*(h2j2)\,x_c^*(h3j4)}{x_c^*(h2j4)\,x_c^*(h3j2)} = \tau^{ik}_{22} = -0.4542,$$

$$\ln \frac{x_c^*(h1j3)\,x_c^*(h2j4)}{x_c^*(h1j4)\,x_c^*(h2j3)} = \tau^{ik}_{13} - \tau^{ik}_{23} = -0.5748,$$

$$\ln \frac{x_c^*(h2j3)\,x_c^*(h3j4)}{x_c^*(h2j4)\,x_c^*(h3j3)} = \tau^{ik}_{23} = -0.4034.$$
The association between sex and post high-school status is of
course dependent on father's occupational level, that is,

$$\ln \frac{x_c^*(1ij1)}{x_c^*(1ij4)} - \ln \frac{x_c^*(2ij1)}{x_c^*(2ij4)} = \tau^{hk}_{11} + \tau^{hjk}_{1j1},$$

$$\ln \frac{x_c^*(1ij2)}{x_c^*(1ij4)} - \ln \frac{x_c^*(2ij2)}{x_c^*(2ij4)} = \tau^{hk}_{12} + \tau^{hjk}_{1j2},$$

$$\ln \frac{x_c^*(1ij3)}{x_c^*(1ij4)} - \ln \frac{x_c^*(2ij3)}{x_c^*(2ij4)} = \tau^{hk}_{13} + \tau^{hjk}_{1j3}.$$


We summarize the numerical values below.


j     \tau^{hk}_{11} + \tau^{hjk}_{1j1}     \tau^{hk}_{12} + \tau^{hjk}_{1j2}     \tau^{hk}_{13} + \tau^{hjk}_{1j3}

1      0.2658                               -1.6577                               -0.5148
2      0.3595                               -1.1541                               -0.4754
3     -0.0988                               -1.1603                               -1.0820
4      0.1472                               -1.1104                               -0.6684
5      0.3533                               -1.2128                               -0.6619
6      0.2348                               -0.5857                               -0.5508
7      0.9935                               -0.3523                               -0.1111

We remark that father's occupational level 3 shows a peculiarity
as compared to other values in the first column above. Kullback et al.
(1962b, p. 593) noted that there was an unusually larger number of girls
than boys for the third category of father's occupation. Apparently
there was a tendency for the girls not to enroll in college as compared
to the boys. In particular, for example, the association between sex and
collegiate or noncollegiate school is

$$\ln \frac{x_c^*(1ij1)}{x_c^*(1ij2)} - \ln \frac{x_c^*(2ij1)}{x_c^*(2ij2)} = \tau^{hk}_{11} + \tau^{hjk}_{1j1} - \tau^{hk}_{12} - \tau^{hjk}_{1j2}.$$

From the preceding results we have


j     \tau^{hk}_{11} + \tau^{hjk}_{1j1} - \tau^{hk}_{12} - \tau^{hjk}_{1j2}

1     1.9235
2     1.5136
3     1.0615
4     1.2576
5     1.5661
6     0.8205
7     1.3458
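The last column is simply the difference of the first two columns of the preceding summary table, and exponentiating it gives the corresponding odds ratios. The sketch below retraces this using the summary values as read from the report; the list names are illustrative.

```python
import math

# Columns of the summary table, j = 1, ..., 7 (boys relative to girls).
college = [0.2658, 0.3595, -0.0988, 0.1472, 0.3533, 0.2348, 0.9935]
noncollegiate = [-1.6577, -1.1541, -1.1603, -1.1104, -1.2128, -0.5857, -0.3523]

# Association between sex and collegiate versus noncollegiate school.
differences = [round(a - b, 4) for a, b in zip(college, noncollegiate)]
odds_ratios = [math.exp(d) for d in differences]

for j, (d, r) in enumerate(zip(differences, odds_ratios), start=1):
    print(j, d, round(r, 2))
```

On the odds scale every level of father's occupation shows boys more likely than girls to choose college over a noncollegiate school, with the smallest ratio at level 6.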


The association between father's occupational level and post
high-school status is dependent on the sex, that is,

$$\ln \frac{x_c^*(hi11)}{x_c^*(hi14)} - \ln \frac{x_c^*(hi71)}{x_c^*(hi74)} = \tau^{jk}_{11} + \tau^{hjk}_{h11},$$

$$\ln \frac{x_c^*(hi21)}{x_c^*(hi24)} - \ln \frac{x_c^*(hi71)}{x_c^*(hi74)} = \tau^{jk}_{21} + \tau^{hjk}_{h21},$$

etc.

$$\ln \frac{x_c^*(hi12)}{x_c^*(hi14)} - \ln \frac{x_c^*(hi72)}{x_c^*(hi74)} = \tau^{jk}_{12} + \tau^{hjk}_{h12},$$

$$\ln \frac{x_c^*(hi22)}{x_c^*(hi24)} - \ln \frac{x_c^*(hi72)}{x_c^*(hi74)} = \tau^{jk}_{22} + \tau^{hjk}_{h22},$$

etc.

$$\ln \frac{x_c^*(hi13)}{x_c^*(hi14)} - \ln \frac{x_c^*(hi73)}{x_c^*(hi74)} = \tau^{jk}_{13} + \tau^{hjk}_{h13},$$

$$\ln \frac{x_c^*(hi23)}{x_c^*(hi24)} - \ln \frac{x_c^*(hi73)}{x_c^*(hi74)} = \tau^{jk}_{23} + \tau^{hjk}_{h23},$$

etc.


A tabulation of these associations is


          h = 1                               h = 2
j         k = 1      k = 2      k = 3         k = 1      k = 2      k = 3

1         1.5094    -0.3149     0.4556        2.2731     0.9905     0.3593
2         0.5992     0.1804     0.3229        1.2332     0.9822     0.6872
3        -0.6914    -0.4148    -0.3376        0.4009     0.3932     0.6333
4         0.2796     0.1300     0.0526        1.1259     0.8881     0.6099
5        -0.0208    -0.4610    -0.0254        0.6194     0.3995     0.5254
6        -0.7908    -0.3731    -0.2408       -0.0321    -0.1397     0.1989

In particular, the association between father's occupational levels
1 and 2 and post high-school status of collegiate and noncollegiate school,
for boys, is

$$\ln \frac{x_c^*(1i11)}{x_c^*(1i12)} - \ln \frac{x_c^*(1i21)}{x_c^*(1i22)} = \tau^{jk}_{11} + \tau^{hjk}_{111} - \tau^{jk}_{12} - \tau^{hjk}_{112} - \tau^{jk}_{21} - \tau^{hjk}_{121} + \tau^{jk}_{22} + \tau^{hjk}_{122}.$$


We shall not pursue this matter any further here. The reader
should be able to examine any particular associations of interest.
Frequency for each High-School Rank x Post High-School Status x Sex x
Father's Occupational Level Combination

b) x(hij.), x(h..k), x(.i.k), x(..jk)        2I(xb*:xa*) = 2672.724        27
                                             2I(x:xb*)   =  151.710        96

c) x(hij.), x(.i.k), x(h.jk)                 2I(xc*:xb*) =   52.850        18
Example 5. Coronary heart disease risk. This example illustrates the
analysis of a three-way 2x4x4 contingency table. It
illustrates the test of equality of certain parameters in
the model of no second-order interaction, both by computing
the estimate implied by the hypothesized relation among some
of the parameters, and also by computing the appropriate
quadratic approximation.
Example


Coronary Heart Disease Risk

We are indebted to Professor S. Greenhouse and J. Cornfield (1962)
for calling our attention to this set of data.

In this example we analyze data from a 3-way, R x S x T, table
resulting from a coronary heart disease study. We denote the observed
values by f(ijk), where


Characteristic                       Index   1        2         3         4

Coronary heart disease         R     i       yes      no
Serum cholesterol, mg/100 cc   S     j       < 200    200-219   220-259   260 +
Blood pressure, mm Hg          T     k       < 127    127-146   147-166   167 +

We ask the reader's indulgence for not using the notation used elsewhere
in this report, that is, x(ijk), x*(ijk), etc.

The complete 2x4x4 table is given in Fig. 1. A preliminary
analysis is given in the analysis of information table shown in Fig. 2,
where the various sets of marginal constraints and the corresponding
information values and degrees of freedom are listed. Interaction
hypotheses corresponding to sets of marginal constraints in the table are

$$H_a:\ p(ijk) = p(i\cdot\cdot)\,p(\cdot jk)$$

$$p(ijk) = \frac{p(ij\cdot)\,p(\cdot jk)}{p(\cdot j\cdot)}$$

and the hypothesis of no second-order interaction.


The effects due to addition of each of the three 2-way marginal
tables are shown immediately above these interactions. We note that
both the information values and the degrees of freedom are additive.

This analysis indicated that a fit to this set of data could be
made adequately using as explanatory variables the marginal cell
frequencies of three marginal tables of dimensions 2 x 4, 2 x 4, and
4 x 4. The hypothesis tested was that of no second-order interaction
in the sense of Bartlett [1935], as discussed by Ku et al. (1971). We
start with H_a because our first concern is whether the incidence of
coronary heart disease is homogeneous over the factors serum cholesterol
and blood pressure. Thus considering 2I(f:fa*) in Fig. 2 as the total
"unexplained variation" we may set up the summary analysis of information
table in Fig. 3.

The interpretation of the no second-order interaction hypothesis
is:

a. The association between blood pressure and heart disease is the same
for different levels of cholesterol,

b. The association between cholesterol level and heart disease is the
same for different levels of blood pressure,

c. The association between cholesterol level and blood pressure is the
same for subjects with and without heart disease.

For the estimate f* under the model of no second-order interaction the
log-odds (logit) of the estimated incidence of coronary heart disease is a
linear additive function of an average effect, an effect due to
cholesterol and an effect due to blood pressure, i.e.,

$$\ln \frac{f^*(1jk)}{f^*(2jk)} = \tau^{i}_{1} + \tau^{ij}_{1j} + \tau^{ik}_{1k}$$


Values of f* are shown in Fig. 4 and the design matrix in Fig. 5.
We note that there are 22 parameters, in addition to $\tau_0$, to be estimated
from the f* values. A complete model would include nine additional
parameters, which, under the no second-order interaction hypothesis, are
equal to zero, i.e.,

$$\tau^{ijk}_{111} = \tau^{ijk}_{112} = \tau^{ijk}_{113} = 0,$$
$$\tau^{ijk}_{121} = \tau^{ijk}_{122} = \tau^{ijk}_{123} = 0,$$
$$\tau^{ijk}_{131} = \tau^{ijk}_{132} = \tau^{ijk}_{133} = 0.$$

We note that the number of parameters in the complete model is
23 + 9 = 32, that is, the number of cells.

The computation of the $\tau$ parameter estimates is straightforward,
e.g.,

$$\tau_1^{i} = \ln \frac{f_2^*(144)}{f_2^*(244)} = -0.9374 ,$$

etc. The values of the $\tau$'s are listed in Fig. 6. For simplicity we
use $\tau$ with no further diacritical marking.

When the "dependent" variable or response variable is dichotomous,
odds and log-odds have long been used as indices indicative of risk.
The estimated log-odds,

$$\ln \frac{f_2^*(1jk)}{f_2^*(2jk)} = \tau_1^{i} + \tau_{1j}^{ij} + \tau_{1k}^{ik} ,$$

and the estimated odds,

$$\frac{f_2^*(1jk)}{f_2^*(2jk)} ,$$

are given in Fig. 7.


From the design matrix or the representation of the log-odds we can
compute the difference in log-odds of risk of heart disease for change
in blood pressure and constant cholesterol concentration in terms of
the $\tau$ parameters, e.g.,

$$\ln \frac{f_2^*(1j2)}{f_2^*(2j2)} - \ln \frac{f_2^*(1j1)}{f_2^*(2j1)} = \ln \frac{f_2^*(112)}{f_2^*(212)} - \ln \frac{f_2^*(111)}{f_2^*(211)} = \tau_{12}^{ik} - \tau_{11}^{ik} = -0.0415 .$$

Similarly,

$$\ln \frac{f_2^*(1j3)}{f_2^*(2j3)} - \ln \frac{f_2^*(1j2)}{f_2^*(2j2)} = 0.5738 ,$$

$$\ln \frac{f_2^*(1j4)}{f_2^*(2j4)} - \ln \frac{f_2^*(1j3)}{f_2^*(2j3)} = 0.6681 .$$

The differences in log-odds for change in cholesterol level and
constant blood pressure are:

$$\ln \frac{f_2^*(12k)}{f_2^*(22k)} - \ln \frac{f_2^*(11k)}{f_2^*(21k)} = -0.2079 ,$$

$$\ln \frac{f_2^*(13k)}{f_2^*(23k)} - \ln \frac{f_2^*(12k)}{f_2^*(22k)} = 0.7702 ,$$

$$\ln \frac{f_2^*(14k)}{f_2^*(24k)} - \ln \frac{f_2^*(13k)}{f_2^*(23k)} = 0.7818 .$$

The differences in log-odds for change in cholesterol level and
change in blood pressure are

$$\ln \frac{f_2^*(122)}{f_2^*(222)} - \ln \frac{f_2^*(111)}{f_2^*(211)} = -0.2494 ,$$

$$\ln \frac{f_2^*(133)}{f_2^*(233)} - \ln \frac{f_2^*(122)}{f_2^*(222)} = 1.3440 ,$$

$$\ln \frac{f_2^*(144)}{f_2^*(244)} - \ln \frac{f_2^*(133)}{f_2^*(233)} = 1.4499 .$$
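The differences quoted above can be checked directly from the fitted
frequencies of Fig. 4. The following is a minimal sketch, not the
program used for the report; the function names are ours, and the
entry for cell (1,4,2) is taken as 11.280, the value that agrees with
the printed row and column totals of Fig. 4.

```python
import math

# Fitted frequencies from Fig. 4. Rows: cholesterol level j = 1..4,
# columns: blood pressure level k = 1..4 (0-based in the code).
# Cell (1,4,2) is entered as 11.280, which matches the printed totals.
chd = [
    [3.550, 3.553, 2.488, 2.409],
    [2.144, 2.340, 1.754, 1.762],
    [6.501, 10.827, 6.227, 7.446],
    [7.805, 11.280, 9.531, 12.382],
]
no_chd = [
    [115.450, 120.447, 47.512, 23.591],
    [85.856, 97.660, 41.246, 21.238],
    [120.499, 209.173, 67.773, 41.554],
    [66.196, 99.720, 47.469, 31.617],
]


def logit(j, k):
    """Log-odds of CHD at cholesterol level j and pressure level k."""
    return math.log(chd[j][k] / no_chd[j][k])


def pressure_steps(j):
    """Change in log-odds between successive pressure levels, j fixed."""
    return [logit(j, k + 1) - logit(j, k) for k in range(3)]


def cholesterol_steps(k):
    """Change in log-odds between successive cholesterol levels, k fixed."""
    return [logit(j + 1, k) - logit(j, k) for j in range(3)]


def diagonal_steps():
    """Change in log-odds when both factors move up one level together."""
    return [logit(m + 1, m + 1) - logit(m, m) for m in range(3)]


if __name__ == "__main__":
    print([round(d, 4) for d in pressure_steps(0)])
    print([round(d, 4) for d in cholesterol_steps(0)])
    print([round(d, 4) for d in diagonal_steps()])
```

Because the fitted table has no second-order interaction, the steps
should be the same (up to rounding of the printed frequencies) whichever
fixed level of the other factor is chosen.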


In view of the negative values of the changes in log-odds represented
by $\tau_{12}^{ik} - \tau_{11}^{ik}$ and $\tau_{12}^{ij} - \tau_{11}^{ij}$,
we wish to check the hypothesis that

$$\tau_{11}^{ij} = \tau_{12}^{ij} , \qquad \tau_{11}^{ik} = \tau_{12}^{ik} ,$$

which would imply that the risk does not begin to manifest itself
significantly until the cholesterol level and blood pressure exceed
some minimum level, that is, a threshold effect. Let

$$Z_1 = \tau_{12}^{ij} - \tau_{11}^{ij} = -0.2079 , \qquad Z_2 = \tau_{12}^{ik} - \tau_{11}^{ik} = -0.0415 .$$

The variance-covariance matrix of the taus for $f_2^*$ is obtained as
follows (a weighted version of Kullback (1959, p. 217)):

Compute $S = T'DT$ where $T$ is the 32 x 23 design matrix for the
log-linear representation of $f_2^*$ in Fig. 5, and $D$ is a diagonal
matrix whose entries are the values of $f_2^*$ in the order of the rows
of the design matrix. Partition the matrix $S$ as

$$S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix} ,$$

where $S_{11}$ is 1 x 1. Then the variance-covariance matrix of the
taus is

$$\left( S_{22} - S_{21} S_{11}^{-1} S_{12} \right)^{-1} \quad \text{or} \quad S_{22 \cdot 1}^{-1} .$$

The covariance matrix of $Z_1$, $Z_2$ is found to be:

$$a_{11} = \sigma^{8,8} + \sigma^{9,9} - 2\sigma^{8,9} = 0.2175 ,$$

$$a_{12} = a_{21} = \sigma^{8,11} - \sigma^{9,11} - \sigma^{8,12} + \sigma^{9,12} = -0.0013 ,$$

$$a_{22} = \sigma^{11,11} + \sigma^{12,12} - 2\sigma^{11,12} = 0.0922 .$$

We found

$$A^{-1} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}^{-1} = \begin{pmatrix} 4.5981 & 0.0648 \\ 0.0648 & 10.8469 \end{pmatrix} ,$$

$$X^2 = (Z_1, Z_2) \, A^{-1} \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix} = 0.2185$$

does not exceed the upper 5% critical value of a chi-squared variate with
2 degrees of freedom.
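The arithmetic of this quadratic test can be retraced as follows. This
is only a sketch using the values quoted above, with $a_{22}$ read as
0.0922 (the value that reproduces the printed inverse); the helper
names are ours. For 2 degrees of freedom the chi-squared upper tail is
exp(-x/2), so no statistical library is needed.

```python
import math

# Values quoted in the text.
Z = (-0.2079, -0.0415)          # Z1, Z2
A = ((0.2175, -0.0013),         # covariance matrix of (Z1, Z2)
     (-0.0013, 0.0922))


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def quadratic_form(z, m_inv):
    """z' M^{-1} z for a 2-vector z."""
    return sum(z[r] * m_inv[r][c] * z[c] for r in range(2) for c in range(2))


A_inv = inverse_2x2(A)
X2 = quadratic_form(Z, A_inv)

# Chi-squared with 2 D.F.: P(X > x) = exp(-x / 2).
critical_5pct = -2.0 * math.log(0.05)   # about 5.99
tail_probability = math.exp(-X2 / 2.0)

if __name__ == "__main__":
    print(round(X2, 4), round(critical_5pct, 3), round(tail_probability, 3))
```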

For this particular hypothesis, we may alternatively revise the
design matrix by combining the columns with $\tau_{11}^{ij}$ and
$\tau_{12}^{ij}$, and with $\tau_{11}^{ik}$ and $\tau_{12}^{ik}$,
and use the iterative procedure suggested by Gokhale [1972], Kullback
[1973] for "unusual marginal totals" to obtain the estimated cell
frequencies. The resulting estimates $f_3^*$ are given in Fig. 8. In
Fig. 9 are listed the log-odds

$$\ln \frac{f_3^*(1jk)}{f_3^*(2jk)}$$

and the odds $f_3^*(1jk)/f_3^*(2jk)$. The associated analysis of
information table is shown in Fig. 10. Note that $2I(f_2^*:f_3^*)$ is a
test of the hypothesis that $\tau_{11}^{ij} = \tau_{12}^{ij}$ and
$\tau_{11}^{ik} = \tau_{12}^{ik}$ and is approximated by the test
previously given as a quadratic chi-squared variate.

Component due to                      Information                     D.F.

1) f(i..), f(.j.), f(..k)             $2I(f:f_1^*)$     = 83.149       24

   a) f(i..), f(.jk)
      ST effect                       $2I(f_a^*:f_1^*)$ = 24.423        9
      Independence R x ST             $2I(f:f_a^*)$     = 58.726       15

   b) f(.jk), f(ij.)
      RS effect/ST                    $2I(f_b^*:f_a^*)$ = 31.921        3
      Conditional independence
      R x T/S                         $2I(f:f_b^*)$     = 26.805       12

2) f(.jk), f(ij.), f(i.k)
      RT effect/ST, RS                $2I(f_2^*:f_b^*)$ = 18.730        3
      Second-order interaction        $2I(f:f_2^*)$     =  8.075        9

Figure 2. Analysis of Information - Coronary Heart Disease Risk Data

Component due to                      Information                     D.F.

f(i..), f(.jk)
   Total                              $2I(f:f_a^*)$     = 58.726       15

f(.jk), f(ij.)
   Cholesterol effect                 $2I(f_b^*:f_a^*)$ = 31.921        3

f(.jk), f(ij.), f(i.k)
   Blood Pressure effect
   given Cholesterol                  $2I(f_2^*:f_b^*)$ = 18.730        3

Second-order interaction
   (Residual)                         $2I(f:f_2^*)$     =  8.075        9

Figure 3. Analysis of Information

                 j: Serum              k: blood pressure, mm Hg
                 cholesterol,       1          2          3          4
                 mg/100 cc        < 127     127-146    147-166     167 +       Total

CHD          1   < 200            3.550      3.553      2.488      2.409      12.000
i = 1        2   200-219          2.144      2.340      1.754      1.762       8.000
             3   220-259          6.501     10.827      6.227      7.446      31.001
             4   260 +            7.805     11.280      9.531     12.382      40.998
                 Total           20.000     28.000     20.000     23.999      91.999

NCHD         1   < 200          115.450    120.447     47.512     23.591     307.000
i = 2        2   200-219         85.856     97.660     41.246     21.238     246.000
             3   220-259        120.499    209.173     67.773     41.554     438.999
             4   260 +           66.196     99.720     47.469     31.617     245.002
                 Total          388.001    527.000    204.000    118.000    1237.001

TOTAL                           408.001    555.000    224.000    141.999    1329.000

Figure 4. Estimated Cell Frequencies under No Second-Order
Interaction Hypothesis, $f_2^*$, Coronary Heart Disease Risk.
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Figure  7.  Log-odds  and  Odds 
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Example  6 


Hospital data. This example illustrates the analysis of a
pair of related three-way 2x2x2 contingency tables. In
particular it illustrates the procedure to obtain an
estimate satisfying certain observed marginal restraints
and having certain of the tau parameters predetermined,
that is, the "inheritance" of certain parameters. It also
mentions that the T-functions of the two-way marginals are
products of the T-functions of the related one-way
marginals.
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Example 
Hospital  Data 

The data used are from the field of hospital
administration and relate to the matter of innovation
in hospitals. We begin with the assumption that the
use of electronic data processing (EDP) in hospitals
in the late 1960's was innovative. This assumption is
substantiated by a variety of surveys of the use of EDP
in hospitals, Hammon et al. (1972). On this basis the
data in a survey of hospitals using EDP conducted by
Herner and Co. were combined with data from the Guide
Issue of Hospitals for the same period so that a file of
records reflecting characteristics of hospitals and
levels at which EDP was used by these hospitals was
created. The hospitals in this survey were selected
by stratified sampling. The stratification (fixed
variable) was on the basis of hospital size. All
hospitals in the large-size category (200 or more beds)
were included in the survey and a ten percent sample
was taken of those in the small-size category. The
data from these files were tabulated and arranged in
multiway contingency tables. The analysis of the
tables for the large and small hospitals will be
described here and interrelated. See Kullback and Reeves
(1974).


On the basis of these analyses we conclude that
there is a distinct relation of innovation on location
and length of stay with a common factor for large and
small hospitals. The association (measured by the
logarithm of the cross-product ratio) between use of
EDP and length of stay is the same for the large and
small hospitals. The log-odds (logit) of use of EDP
in descending order of magnitude within the large
hospitals and within the small hospitals are parallel in
terms of the combinations of the factors location and
length of stay. The usage of EDP is generally greater
in the large hospitals than in the small hospitals
except that the best log-odds for the small hospitals
is greater than the poorest log-odds for the large
hospitals.


In a study to identify characteristics which
distinguish hospitals which use EDP from those which do
not, that is, to identify characteristics which are
significantly associated with use of EDP, data on 1176
hospitals, 923 large and 253 small, were collected with
respect to use, location, and length of stay. The data
appear in the two three-way 2x2x2 contingency tables 1
and 2. In order to determine the relation among the
free variables use, location and length of stay, indexed
by size of hospital, and interactions that may exist
among these characteristics it seems intuitively clear
that an analysis based only on two-way tables would not
suffice.
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We  shall  denote  the  occurrences  in  the  observed 
tables  1  and  2  respectively  by  x(ijk),  y(ijk)  with 
i=l,  user;  i=2,  non-user 
j=l,  urban;  j=2,  rural 
k=l,  short;  k=2,  long. 

The proposed procedure provides estimates for the
original data analogous to a regression procedure
using sets of observed marginals as explanatory
variables and we shall try to find an estimate which
does not differ significantly from the observed
data. The set of acceptable estimates will indicate
the nature of the significant interactions for which
we can compute numerical measures.

As a first step in the analysis we shall find
"smoothed" estimates of the original data. We shall
do this for the large hospitals also even though the
data for all large hospitals was collected. We
examine the minimum discrimination information estimates
obtained by a convergent iterative algorithm starting
with a uniform table and successively adjusting for
sets of observed marginals. It turns out that the sets
of two-way marginals are best and the resultant
estimates provide a satisfactory fit. The estimated tables
have the same two-way and also the same one-way
marginals as the original tables. These estimates,
which we denote by $x_2^*(ijk)$, $y_2^*(ijk)$ respectively for
the large and small hospitals, are given in tables 3
and 4 and imply no second-order (three-factor)
interaction. Note that the estimate for the observed
y(122) = 0 is $y_2^*(122)$ = 0.137.
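The iterative adjustment just described can be sketched as below. This
is an illustration under stated assumptions rather than the program
used for the report: the observed tables 1 and 2 are not reproduced in
this section, so the printed Table 3 values stand in as data. Since
Table 3 has, up to rounding, no three-factor interaction, fitting its
own two-way marginals from a uniform start should return Table 3
itself. The function names are ours.

```python
from itertools import product

# Table 3 of the text, used here as stand-in data.
# i = use (1 user, 2 non-user), j = location (1 urban, 2 rural),
# k = stay (1 short, 2 long).
table3 = {
    (1, 1, 1): 374.305, (1, 1, 2): 41.694,
    (1, 2, 1): 53.695, (1, 2, 2): 13.306,
    (2, 1, 1): 218.693, (2, 1, 2): 110.308,
    (2, 2, 1): 52.307, (2, 2, 2): 58.692,
}
CELLS = list(product((1, 2), repeat=3))


def marginal(table, axes):
    """Sum the table over every axis not listed in `axes`."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0.0) + value
    return out


def fit(start, targets, cycles=200):
    """Rescale the table to each target marginal in turn, repeatedly."""
    est = dict(start)
    for _ in range(cycles):
        for axes, wanted in targets:
            current = marginal(est, axes)
            for cell in est:
                key = tuple(cell[a] for a in axes)
                est[cell] *= wanted[key] / current[key]
    return est


# The three sets of two-way marginals: (ij.), (i.k), (.jk).
two_way = [(axes, marginal(table3, axes)) for axes in ((0, 1), (0, 2), (1, 2))]
n = sum(table3.values())
uniform = {cell: n / len(CELLS) for cell in CELLS}
smoothed = fit(uniform, two_way)

if __name__ == "__main__":
    for cell in CELLS:
        print(cell, round(smoothed[cell], 3))
```

A fixed number of cycles is used for brevity; a production routine
would more likely stop when successive tables agree to a chosen
tolerance.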

The estimates are given analytically by the log-linear
representation of an exponential family

$$\ln \frac{x_2^*(ijk)}{n\pi(ijk)} = L + \tau_1 T_1(ijk) + \tau_2 T_2(ijk) + \tau_3 T_3(ijk) + \tau_4 T_4(ijk) + \tau_5 T_5(ijk) + \tau_6 T_6(ijk) \qquad (1)$$

where $n = \sum\sum\sum x(ijk)$, $\pi(ijk) = 1/(2 \times 2 \times 2)$, L is a
normalizing constant, the taus are main-effect and interaction
parameters, and the T(ijk) are a set of linearly independent
random variables, in this case the indicator functions
of the respective marginals. A similar representation
holds for $y_2^*(ijk)$. The log-linear representations
are shown graphically in Fig. 1. The values in
the various columns of Fig. 1, zeros or ones, are the
values of the respective functions T(ijk). Note that

$$T_4(ijk) = T_1(ijk)T_2(ijk), \quad T_5(ijk) = T_1(ijk)T_3(ijk), \quad T_6(ijk) = T_2(ijk)T_3(ijk) .$$

To test the goodness-of-fit of the estimates we
compute the statistics [3,4]

$$2I(x:x_2^*) = 2\sum\sum\sum x(ijk) \ln \frac{x(ijk)}{x_2^*(ijk)} = 0.481, \quad 1 \text{ D.F.}$$

$$2I(y:y_2^*) = 2\sum\sum\sum y(ijk) \ln \frac{y(ijk)}{y_2^*(ijk)} = 0.294, \quad 1 \text{ D.F.}$$

Since the statistics are asymptotically distributed as
$\chi^2$ we conclude that the "smoothed" values $x_2^*$, $y_2^*$ are good
estimates and we shall use them in our subsequent
analysis.

From the log-linear representation (1) or the graphical
presentation in Fig. 1, we find that the log-odds
or logits of the use of EDP for large hospitals is
given by the parametric representation

$$\ln \frac{x_2^*(111)}{x_2^*(211)} = \tau_1 + \tau_4 + \tau_5 ,$$

$$\ln \frac{x_2^*(112)}{x_2^*(212)} = \tau_1 + \tau_4 , \qquad (2)$$

$$\ln \frac{x_2^*(121)}{x_2^*(221)} = \tau_1 + \tau_5 ,$$

$$\ln \frac{x_2^*(122)}{x_2^*(222)} = \tau_1 ,$$

where the values of the parameters for the estimate
$x_2^*(ijk)$ are found to be

$$\tau_1 = -1.4842, \quad \tau_4 = 0.5113, \quad \tau_5 = 1.5103 .$$


From (2) we also see that for the large hospitals

$$\tau_4 = \ln \frac{x_2^*(111)\,x_2^*(221)}{x_2^*(211)\,x_2^*(121)} = \ln \frac{x_2^*(112)\,x_2^*(222)}{x_2^*(212)\,x_2^*(122)} = 0.5113 ,$$

that is, the association between usage and location
for either short or long stay. Similarly

$$\tau_5 = \ln \frac{x_2^*(111)\,x_2^*(212)}{x_2^*(211)\,x_2^*(112)} = \ln \frac{x_2^*(121)\,x_2^*(222)}{x_2^*(221)\,x_2^*(122)} = 1.5103 ,$$

that is, the association between usage and stay
for either urban or rural location.
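The parameter values for the large hospitals can be recovered from the
printed estimates of Table 3 using the logit representation (2). The
sketch below is ours (the function and variable names are not from the
report); small differences in the last decimal place are to be expected
because Table 3 is printed to three decimals.

```python
import math

# Table 3 of the text: (i, j, k) with i = use, j = location, k = stay.
x = {
    (1, 1, 1): 374.305, (1, 1, 2): 41.694,
    (1, 2, 1): 53.695, (1, 2, 2): 13.306,
    (2, 1, 1): 218.693, (2, 1, 2): 110.308,
    (2, 2, 1): 52.307, (2, 2, 2): 58.692,
}


def log_odds(j, k):
    """Logit of EDP use at location j and stay k."""
    return math.log(x[(1, j, k)] / x[(2, j, k)])


# From (2): the rural, long-stay logit is tau1 alone; the other logits
# add tau4 (urban) and tau5 (short stay).
tau1 = log_odds(2, 2)
tau4 = log_odds(1, 2) - log_odds(2, 2)
tau5 = log_odds(2, 1) - log_odds(2, 2)

if __name__ == "__main__":
    print(round(tau1, 4), round(tau4, 4), round(tau5, 4))
    print(round(log_odds(1, 1), 4), round(tau1 + tau4 + tau5, 4))
```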

For the small hospitals the log-odds or logits are

$$\ln \frac{y_2^*(111)}{y_2^*(211)} = \tau_1 + \tau_4 + \tau_5 ,$$

$$\ln \frac{y_2^*(112)}{y_2^*(212)} = \tau_1 + \tau_4 ,$$

$$\ln \frac{y_2^*(121)}{y_2^*(221)} = \tau_1 + \tau_5 ,$$

$$\ln \frac{y_2^*(122)}{y_2^*(222)} = \tau_1 ,$$

where the values of the parameters for the estimate
$y_2^*(ijk)$ are found to be

$$\tau_1 = -3.3357, \quad \tau_4 = 1.3088, \quad \tau_5 = 0.9836 .$$

For the small hospitals we also have

$$\tau_4 = \ln \frac{y_2^*(111)\,y_2^*(221)}{y_2^*(211)\,y_2^*(121)} = \ln \frac{y_2^*(112)\,y_2^*(222)}{y_2^*(212)\,y_2^*(122)} = 1.3088 ,$$

that is, the association between usage and location for
either short or long stay. Similarly

$$\tau_5 = \ln \frac{y_2^*(111)\,y_2^*(212)}{y_2^*(211)\,y_2^*(112)} = \ln \frac{y_2^*(121)\,y_2^*(222)}{y_2^*(221)\,y_2^*(122)} = 0.9836 ,$$

that is, the association between usage and stay for
either urban or rural locations.

Since the data for the large hospitals reflect
observations over all such hospitals, it will be of
interest to determine whether there exists a suitable
estimate for the small hospitals, other than $y_2^*(ijk)$,
which will have some of its interactions (associations)
the same as the corresponding values for the large
hospitals. This can be accomplished by using the
iterative algorithm fitting various subsets of marginals
of $y_2^*(ijk)$ (or the original y(ijk)) but starting with
a distribution which has the same tau parameters as
$x_2^*(ijk)$. The tau parameters of $x_2^*(ijk)$ not affected by
the iterative fitting procedure will be "inherited"
by the resultant estimate. We shall use the table
$v(ijk) = (253/923)\,x_2^*(ijk)$ which has the same tau
parameters as the $x_2^*(ijk)$ table with total adjusted to be
the same as the observed total of small hospitals.

We summarize the procedure: starting the iterative
fitting algorithm with v(ijk) (recall that y(ijk) and
$y_2^*(ijk)$ have the same two-way and one-way marginals)

                                                 Tau parameters
                                                 "inherited"
      Marginals fitted           Estimate        from v(ijk)

a)    y(i.k), y(.jk)             $u_a^*(ijk)$    $\tau_4$
b)    y(ij.), y(.jk)             $u_b^*(ijk)$    $\tau_5$
c)    y(ij.), y(i.k)             $u_c^*(ijk)$    $\tau_6$
d)    y(.jk), y(i..)             $u_d^*(ijk)$    $\tau_4, \tau_5$
e)    y(i.k), y(.j.)             $u_e^*(ijk)$    $\tau_4, \tau_6$
f)    y(ij.), y(..k)             $u_f^*(ijk)$    $\tau_5, \tau_6$
g)    y(i..), y(.j.), y(..k)     $u_g^*(ijk)$    $\tau_4, \tau_5, \tau_6$

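The "inheritance" in case b) can be seen numerically. In the sketch
below the starting table v(ijk) is built from the printed Table 3, but
the small-hospital counts are HYPOTHETICAL (chosen only to total 253),
because the observed table 2 is not reproduced in this section; the
function names are also ours. Fitting the marginals (ij.) and (.jk)
rescales each cell by factors depending on (i,j) and on (j,k) only, so
the use-by-stay log cross-product ratio of the starting table should
be carried over unchanged.

```python
import math

# Table 3 of the text: smoothed large-hospital estimates.
x_large = {
    (1, 1, 1): 374.305, (1, 1, 2): 41.694,
    (1, 2, 1): 53.695, (1, 2, 2): 13.306,
    (2, 1, 1): 218.693, (2, 1, 2): 110.308,
    (2, 2, 1): 52.307, (2, 2, 2): 58.692,
}
# Starting table of the text: v(ijk) = (253/923) x*(ijk).
v = {cell: 253.0 / 923.0 * value for cell, value in x_large.items()}

# HYPOTHETICAL small-hospital counts, for illustration only.
y_hyp = {
    (1, 1, 1): 12.0, (1, 1, 2): 1.0,
    (1, 2, 1): 10.0, (1, 2, 2): 2.0,
    (2, 1, 1): 40.0, (2, 1, 2): 30.0,
    (2, 2, 1): 60.0, (2, 2, 2): 98.0,
}


def marginal(table, axes):
    """Sum the table over every axis not listed in `axes`."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0.0) + value
    return out


def fit(start, targets, cycles=500):
    """Rescale the table to each target marginal in turn, repeatedly."""
    est = dict(start)
    for _ in range(cycles):
        for axes, wanted in targets:
            current = marginal(est, axes)
            for cell in est:
                key = tuple(cell[a] for a in axes)
                est[cell] *= wanted[key] / current[key]
    return est


def stay_association(table, j):
    """Log cross-product ratio between use (i) and stay (k) at location j."""
    return math.log(table[(1, j, 1)] * table[(2, j, 2)]
                    / (table[(2, j, 1)] * table[(1, j, 2)]))


# Case b): fit the (ij.) and (.jk) marginals, starting from v.
targets_b = [((0, 1), marginal(y_hyp, (0, 1))),
             ((1, 2), marginal(y_hyp, (1, 2)))]
u_b = fit(v, targets_b)

if __name__ == "__main__":
    for j in (1, 2):
        print(j, round(stay_association(v, j), 4),
              round(stay_association(u_b, j), 4))
```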

In order to test whether the $u^*$ estimates differ
significantly from the $y_2^*$ estimates, that is, whether the
interaction parameters in $y_2^*$ differ significantly from
the interaction parameters in $u^*$ "inherited" from $x_2^*$
or v, we compute the statistic

$$2I(y_2^*:u_m^*) = 2\sum\sum\sum y_2^*(ijk) \ln \frac{y_2^*(ijk)}{u_m^*(ijk)}$$

which is asymptotically distributed as $\chi^2$ with 1 D.F.
for m=a,b,c, 2 D.F. for m=d,e,f, 3 D.F. for m=g.

The only case which yielded a non-significant
value was $u_b^*(ijk)$ for which

$$2I(y_2^*:u_b^*) = 0.408, \quad 1 \text{ D.F.}$$

The values of $u_b^*(ijk)$ are given in Table 5.

The log-linear representation for $u_b^*(ijk)$ in terms
of v(ijk) is

$$\ln \frac{u_b^*(ijk)}{v(ijk)} = L + \tau_1 T_1(ijk) + \tau_2 T_2(ijk) + \tau_3 T_3(ijk) + \tau_4 T_4(ijk) + \tau_6 T_6(ijk) \qquad (3)$$

Note that $\tau_5$ does not appear explicitly in (3). By
using the log-linear representation for v(ijk) itself
we also get the reparametrization or log-linear
representation for $u_b^*(ijk)$ in terms of the uniform
distribution

$$\ln \frac{u_b^*(ijk)}{n\pi(ijk)} = L + \tau_1 T_1(ijk) + \tau_2 T_2(ijk) + \tau_3 T_3(ijk) + \tau_4 T_4(ijk) + \tau_5 T_5(ijk) + \tau_6 T_6(ijk) \qquad (4)$$

We  remark  that  the  numerical  values  of  the  taus  in  (3) 
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The  log-odds  or  logits  of  the  use  of  EDP  for  small 
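hospitals may now be given by the parametric
representation

$$\ln \frac{u_b^*(111)}{u_b^*(211)} = \tau_1 + \tau_4 + \tau_5 ,$$

$$\ln \frac{u_b^*(112)}{u_b^*(212)} = \tau_1 + \tau_4 ,$$

$$\ln \frac{u_b^*(121)}{u_b^*(221)} = \tau_1 + \tau_5 ,$$

$$\ln \frac{u_b^*(122)}{u_b^*(222)} = \tau_1 ,$$

with $\tau_5 = 1.5103$.

For the small hospitals we now have the associations

$$\tau_4 = \ln \frac{u_b^*(111)\,u_b^*(221)}{u_b^*(211)\,u_b^*(121)} = \ln \frac{u_b^*(112)\,u_b^*(222)}{u_b^*(212)\,u_b^*(122)} ,$$

$$\tau_5 = \ln \frac{u_b^*(111)\,u_b^*(212)}{u_b^*(211)\,u_b^*(112)} = \ln \frac{u_b^*(121)\,u_b^*(222)}{u_b^*(221)\,u_b^*(122)} = 1.5103 .$$

We note that the association between usage and
location for the small hospitals is still different
from that for the large hospitals, but that the
association between usage and stay, $\tau_5$, is now the same for
both large and small hospitals.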

Arranging the log-odds of usage in descending
order of magnitude within the large hospitals and
within the small hospitals we find

Large hospitals                                    Factors         Small hospitals

$\ln [x_2^*(111)/x_2^*(211)]$ =  0.5374            Urban, Short    $\ln [u_b^*(111)/u_b^*(211)]$ = -1.0111
$\ln [x_2^*(121)/x_2^*(221)]$ =  0.0262            Rural, Short    $\ln [u_b^*(121)/u_b^*(221)]$ = -2.3466
$\ln [x_2^*(112)/x_2^*(212)]$ = -0.9729            Urban, Long     $\ln [u_b^*(112)/u_b^*(212)]$ = -2.5214
$\ln [x_2^*(122)/x_2^*(222)]$ = -1.4841            Rural, Long     $\ln [u_b^*(122)/u_b^*(222)]$ = -3.8569

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Table  3 

Large Hospitals $x_2^*(ijk)$

                     Urban                    Rural
                Short       Long         Short       Long         Total

User           374.305     41.694        53.695     13.306       483.000
Non-user       218.693    110.308        52.307     58.692       440.000

Total          592.998    152.002       106.002     71.998       923.000



Example  7.  Partitioning  using  OUTLIERS 


Outliers are observations in one or more cells of a contingency
table which apparently deviate significantly from a fitted model. These
outliers may lead one to reject a model which fits the other observations.

In other cases even though a model seems to fit, the outliers
contribute much more than reasonable to the measure of deviation between
the data and the fitted values of the model. In other words, the
outliers make up a large percentage of the "unexplained variation"
$2I(x:x^*)$.

A clue to possible outliers is provided by the output of the
computer program. In the computer output for each estimate five entries are
listed for each cell. The fourth of these is titled OUTLIER and its
numerical value provides a lower bound for the decrease in the
corresponding $2I(x:x_a^*)$, if that cell were not included in the fitting
procedure. Since the reduction in the degrees of freedom is one for each
omitted cell, values of OUTLIER greater than say 3.5 are of interest. The
basis for the OUTLIER computation and interpretation follows. Let $x_a^*$
denote the minimum discrimination information estimate subject to certain
marginal restraints. Let $x_b^*$ denote the minimum discrimination
information estimate subject to the same marginal restraints as $x_a^*$ except
that the value $x(\omega_1)$, say, is not included, so that
$x_b^*(\omega_1) = x(\omega_1)$.
The basic additivity property of the minimum discrimination information
statistics states that

$$2I(x:x_a^*) = 2I(x_b^*:x_a^*) + 2I(x:x_b^*)$$

or

$$2I(x:x_a^*) - 2I(x:x_b^*) = 2I(x_b^*:x_a^*) .$$

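These results are summarized in the Analysis of Information Table.

ANALYSIS OF INFORMATION TABLE

Component due to                                   Information           D.F.

$H_a$:                                             $2I(x:x_a^*)$         $N_a$
$H_b$: Same as $H_a$ but omitting $x(\omega_1)$    $2I(x_b^*:x_a^*)$     1
                                                   $2I(x:x_b^*)$         $N_a - 1$

But

$$2I(x_b^*:x_a^*) = 2\left( x_b^*(\omega_1) \ln \frac{x_b^*(\omega_1)}{x_a^*(\omega_1)} + \sum_{\Omega-\omega_1} x_b^*(\omega) \ln \frac{x_b^*(\omega)}{x_a^*(\omega)} \right) = 2\left( x(\omega_1) \ln \frac{x(\omega_1)}{x_a^*(\omega_1)} + \sum_{\Omega-\omega_1} x_b^*(\omega) \ln \frac{x_b^*(\omega)}{x_a^*(\omega)} \right) \qquad (1)$$

and using the convexity property which implies that

$$\sum_{\Omega-\omega_1} x_b^*(\omega) \ln \frac{x_b^*(\omega)}{x_a^*(\omega)} \ge \left( n - x_b^*(\omega_1) \right) \ln \frac{n - x_b^*(\omega_1)}{n - x_a^*(\omega_1)} , \qquad (2)$$

(1) and (2) together give, since $x_b^*(\omega_1) = x(\omega_1)$,

$$2I(x_b^*:x_a^*) \ge 2\left( x(\omega_1) \ln \frac{x(\omega_1)}{x_a^*(\omega_1)} + \left( n - x(\omega_1) \right) \ln \frac{n - x(\omega_1)}{n - x_a^*(\omega_1)} \right) .$$
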


The last value can be computed and is listed as the OUTLIER entry for each
cell of the computer output for the estimate $x_a^*$.

The ratio

$$\frac{2I(x:x_a^*) - 2I(x:x_b^*)}{2I(x:x_a^*)} = \frac{2I(x_b^*:x_a^*)}{2I(x:x_a^*)}$$

then indicates the percentage of the "unexplained variation" due to the
outlier value.
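The lower bound can be sketched for a two-way table fitted under
independence. This is our own illustration, not the CONTAB code: the
function names and the 2 x 3 table are hypothetical. The bound for each
cell compares the cell with the rest of the table lumped together, so
it can never be negative and can never exceed the total $2I(x:x_a^*)$.

```python
import math


def x_log_ratio(a, b):
    """a * ln(a / b), with the usual convention that 0 * ln 0 = 0."""
    return 0.0 if a == 0 else a * math.log(a / b)


def outlier_bound(observed, fitted, n):
    """Lower bound for the drop in 2I if this one cell is left out."""
    return 2.0 * (x_log_ratio(observed, fitted)
                  + x_log_ratio(n - observed, n - fitted))


def independence_fit(table):
    """x*(ij) = x(i.) x(.j) / n for a two-way table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return [[rows[i] * cols[j] / n for j in range(len(cols))]
            for i in range(len(rows))]


def two_i(table, fitted):
    """2I(x:x*) = 2 * sum of x ln(x / x*)."""
    return 2.0 * sum(x_log_ratio(x, f)
                     for row, frow in zip(table, fitted)
                     for x, f in zip(row, frow))


# Hypothetical 2 x 3 table of counts.
counts = [[20, 15, 5],
          [10, 15, 35]]
n_total = sum(sum(r) for r in counts)
fit_a = independence_fit(counts)
total = two_i(counts, fit_a)
bounds = [[outlier_bound(x, f, n_total) for x, f in zip(row, frow)]
          for row, frow in zip(counts, fit_a)]

if __name__ == "__main__":
    print(round(total, 3))
    for row in bounds:
        print([round(b, 3) for b in row])
```

Dividing a cell's bound by the total gives a conservative version of
the percentage of "unexplained variation" described above.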

This property is also utilized in the next example. See Ireland
(1972) and Ireland and Kullback (1974) for further discussion and
application.
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Partitioning  Using  Outliers 


We shall use the OUTLIER feature of the CONTAB
program to partition a 2x7 table into homogeneous
segments.

Table Ia presents data on leukemia cases observed.
Denoting the entries in the observed table by x(ij),
i=1,2, j=1,2,...,7 we first test whether the incidence
of leukemia is homogeneous over the doses by fitting
the marginals x(i.), x(.j). The corresponding output
is shown in Table II. We observe that large OUTLIER
values are associated with values of j=1,2,6,7 and that
$2I(x:x^*)$ = 44.65, 6 D.F.

Since the doses are arranged on a scale we repeat
the process omitting the cells corresponding to x(ij),
i=1,2, j=6,7. The corresponding output is shown in
Table III. We observe that a large OUTLIER value is
associated with j=3 and that $2I(x:x_a^*)$ = 18.92, 4 D.F.

We continue the process using the original cells
corresponding to j=3,4,5. The computer output is given
in Table IV. Now there are no large OUTLIER values and
$2I(x:x_b^*)$ = 0.09, 2 D.F. For the original cells with
j=6,7 the computer output is given in Table V and again
there are no large OUTLIERS and $2I(x:x_c^*)$ = 0.37, 1 D.F.
For the original cells with j=1,2 the computer output is
given in Table VI and again, there are no large OUTLIERS
and $2I(x:x_d^*)$ = 0.91, 1 D.F.

We may summarize in the Analysis of Information Tables.

Component due to                  Information                       D.F.

cells j=1,...,7                   $2I(x:x^*)$       = 44.649          6
omit cells j=6,7                  $2I(x_a^*:x^*)$   = 25.734          2
cells j=1,...,5  1/               $2I(x:x_a^*)$     = 18.915          4
omit cells j=1,2                  $2I(x_b^*:x_a^*)$ = 18.826          2
cells j=3,4,5  2/                 $2I(x:x_b^*)$     =  0.089          2

                                  $2I(x:x^*)$       = 44.649          6
omit cells j=1,2,3,4,5            $2I(x_c^*:x^*)$   = 44.283          5
cells j=6,7  3/                   $2I(x:x_c^*)$     =  0.366          1

                                  $2I(x:x^*)$       = 44.649          6
omit cells j=3,4,5,6,7            $2I(x_d^*:x^*)$   = 43.740          5
cells j=1,2  4/                   $2I(x:x_d^*)$     =  0.909          1

1/ Note that $x_a^*(ij) = x(i.)x(.j)/n$, i=1,2, j=1,2,...,5
             $x_a^*(ij) = x(ij)$, i=1,2, j=6,7

2/ Note that $x_b^*(ij) = x(i.)x(.j)/n$, i=1,2, j=3,4,5
             $x_b^*(ij) = x(ij)$, i=1,2, j=1,2,6,7

3/ Note that $x_c^*(ij) = x(i.)x(.j)/n$, i=1,2, j=6,7
             $x_c^*(ij) = x(ij)$, i=1,2, j=1,2,3,4,5

4/ Note that $x_d^*(ij) = x(i.)x(.j)/n$, i=1,2, j=1,2
             $x_d^*(ij) = x(ij)$, i=1,2, j=3,4,5,6,7

We now define an overall estimate by

$$x_e^*(ij) = x_d^*(ij), \quad i=1,2, \; j=1,2$$
$$x_e^*(ij) = x_b^*(ij), \quad i=1,2, \; j=3,4,5$$
$$x_e^*(ij) = x_c^*(ij), \quad i=1,2, \; j=6,7$$

and we have for the associated minimum discrimination
information statistic

$$2I(x:x_e^*) = 1.364, \quad 4 \text{ D.F.}$$

The values of $x_e^*(ij)$ are given in Table Ib.
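The piecewise estimate can be sketched as follows. Table Ia is not
reproduced in this section, so the 2 x 7 counts below are HYPOTHETICAL
and the function names are ours. The sketch fits independence within
each segment, adds up the segment statistics (the analogue of
0.909 + 0.089 + 0.366 = 1.364 with 1 + 2 + 1 = 4 D.F.), and checks the
algebraic identity that the homogeneity statistic for the whole table
equals the within-segment sum plus the statistic for the table
collapsed to one column per segment.

```python
import math


def two_i_independence(table):
    """2I(x:x*) for the independence fit x*(ij) = x(i.) x(.j) / n."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    total = 0.0
    for i, row in enumerate(table):
        for j, x in enumerate(row):
            if x > 0:
                total += x * math.log(x * n / (rows[i] * cols[j]))
    return 2.0 * total


def take_columns(table, columns):
    """Sub-table made of the listed columns only."""
    return [[row[j] for j in columns] for row in table]


def collapse(table, segments):
    """One column per segment, holding the segment totals."""
    return [[sum(row[j] for j in seg) for seg in segments] for row in table]


# HYPOTHETICAL 2 x 7 table (rows: cases, non-cases; columns: doses).
counts = [[2, 3, 9, 11, 8, 20, 25],
          [98, 97, 91, 89, 92, 80, 75]]
# Segments as found in the text: j = 1,2; j = 3,4,5; j = 6,7 (0-based).
segments = [[0, 1], [2, 3, 4], [5, 6]]

within = [two_i_independence(take_columns(counts, s)) for s in segments]
within_df = sum(len(s) - 1 for s in segments)
between = two_i_independence(collapse(counts, segments))
overall = two_i_independence(counts)

if __name__ == "__main__":
    print([round(w, 3) for w in within], within_df)
    print(round(between, 3), round(overall, 3))
```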

The data of Table Ia comes from Sugiura, N. and
Otake, M. (1973). Approximate distribution of the maximum
of c-1 $\chi^2$ statistics (2x2) derived from 2xC contingency
table. Communications in Statistics 1(1), 9-16. We
arrived at the same partitioning by a different approach.
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Example  8.  Respiratory  data.  This  example  deals  with  two  three-way 
9x2x2 contingency tables which are essentially marginal
tables of a higher dimensional table, not available to us,
listing data on respiratory symptoms among a group of
British coal miners. It illustrates the use of OUTLIER to
partition second-order interaction in a three-way contingency
table. Also illustrated are multivariate logit analysis and
the relations among the parameters implied by logit linearity.
The generalized iterative scaling algorithm of Darroch and
Ratcliff (1972) is used to obtain the m.d.i. estimates under
the hypothesis of logit linearity.

EXAMPLE - RESPIRATORY DATA


This example deals with two three-way contingency tables arising
from respiratory symptoms among the same group of British coal miners.

The  analyses  progressively  consider  more  complex  hypotheses  because  of 
basic  differences  in  certain  properties  of  the  two  sets  of  data.  Among 
other  features  the  example  illustrates  a  test  of  the  hypothesis  of  no 
second-order  interaction  in  a  three-way  contingency  table,  multivariate 
logit analysis, and the partitioning of second-order interaction in a
three-way  contingency  table. 

The  techniques  are  based  on  the  principle  of  minimum  discrimination 
information  estimation,  the  associated  log-linear  representation  and 
analysis  of  information  tables  (see  Ku  et  al.  1971,  Kullback  1959,  pp. 
36-54,  155-186;  1970).  The  computational  procedures  for  this  example 
utilized  the  Deming-Stephan  iterative  marginal  fitting  algorithm  and  its 
extension  to  general  linear  constraints  by  Darroch  and  Ratcliff  (1972). 
Since  our  m.d.i.  estimates  are  constrained  to  satisfy  certain  linear 
relations  based  on  observed  values,  they  are  maximum  likelihood  estimates 
and  the  associated  m.d.i.  test  statistics  are  log-likelihood  ratio 
statistics.  The  log-linear  model  has  been  discussed  in  many  papers  and 
further references may be found in Dempster (1971), Gokhale (1971), Ku
et  al.  (1971),  Plackett  (1969). 

In  Grizzle  (1971)  a  model  developed  by  Grizzle,  Starmer,  and  Koch 
(1969)  is  specialized  to  the  case  of  fitting  models  to  correlated  logits. 
Grizzle (1971, p. 1060) says, "Unfortunately a test of the goodness-of-fit
of the logit model to the joint response data has not been developed."

For its methodological interest, we first consider the problem as
presented by Grizzle (1971) from the minimum discrimination information
estimation approach. Our results (maximum likelihood) are numerically
in close agreement with those of Grizzle (BAN), but also include estimates
of the cell entries under the logit model and a test of the goodness-of-fit
to the joint response data.

In Table 1 is given a 9x2x2 contingency table of coal-miners
classified as smokers without radiological pneumoconiosis, between the
ages of 20 and 64 years inclusive at the time of their examination,
showing the occurrence of breathlessness and wheeze over nine age
groupings. We denote the observed frequency in any cell by x(ijk) with

Variable                Index      1       2       3       4      ...      9

Age Group          A      i      20-24   25-29   30-34   35-39    ...    60-64
Breathlessness     B      j      yes     no
Wheeze             W      k      yes     no

These data are discussed and analysed from a different point of view by
Ashford and Sowden (1970), Mantel and Brown (1973).

A log-linear representation of the observed values x(ijk) in
Table 1 is given in columns 1-36 of Fig. 1. The representation in
Fig. 1 is a graphic presentation of the design matrix of the complete
log-linear regression

$$\ln \frac{x(ijk)}{n\pi(ijk)} = L + \tau_1^{A} T_1^{A}(ijk) + \cdots + \tau_8^{A} T_8^{A}(ijk) + \tau_1^{B} T_1^{B}(ijk) + \tau_1^{W} T_1^{W}(ijk) + \tau_{11}^{AB} T_{11}^{AB}(ijk) + \cdots + \tau_{81}^{AB} T_{81}^{AB}(ijk) + \tau_{11}^{AW} T_{11}^{AW}(ijk) + \cdots + \tau_{81}^{AW} T_{81}^{AW}(ijk) + \tau_{11}^{BW} T_{11}^{BW}(ijk) + \tau_{111}^{ABW} T_{111}^{ABW}(ijk) + \cdots + \tau_{811}^{ABW} T_{811}^{ABW}(ijk) , \qquad (1)$$

where $\pi(ijk) = 1/(9 \times 2 \times 2)$, n is the total number of observations, L is a
normalizing factor (the negative of the logarithm of a moment generating
function) and the T(ijk) are linearly independent indicator functions
(explanatory variables) taking on the values given by the columns of
Fig. 1 and whose mean values are the various marginals.

Since Grizzle (1971) is concerned with the marginal logits of
breathlessness and wheeze, this means implicitly that one is concerned
with the minimum discrimination information estimate, or log-linear
representation, obtained by fitting the marginals x(ij.) and x(i.k).

If we denote this estimate by $x^*_d(ijk)$, then its log-linear representation
or design matrix is given by columns 1-27 of Fig. 1. It may be verified
that $x^*_d$ has the explicit form $x^*_d(ijk) = x(ij.)x(i.k)/x(i..)$ and
consequently we have the marginal logits

$$
\ln\frac{x^*_d(i1k)}{x^*_d(i2k)} = \ln\frac{x(i1.)\,x(i.k)\,x(i..)}{x(i..)\,x(i2.)\,x(i.k)} = \ln\frac{x(i1.)}{x(i2.)} \qquad \text{(breathlessness)}
$$

$$
\ln\frac{x^*_d(ij1)}{x^*_d(ij2)} = \ln\frac{x(ij.)\,x(i.1)\,x(i..)}{x(i..)\,x(ij.)\,x(i.2)} = \ln\frac{x(i.1)}{x(i.2)} \qquad \text{(wheeze)}.
$$

The values of ln(x(i1.)/x(i2.)) and ln(x(i.1)/x(i.2)) are given in
Grizzle (1971, p. 1060) and the values of $x^*_d(ijk)$ are given in Table 2.
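The explicit form of the estimate that fits x(ij.) and x(i.k), and the fact that its logits reduce to the marginal logits, can be checked numerically. The sketch below is only an illustration: the function name `fit_conditional_independence` and the 2x2x2 frequencies are hypothetical and are not the data of Table 1.

```python
import math
from itertools import product


def fit_conditional_independence(x, I, J, K):
    """Return x(ij.) x(i.k) / x(i..) for a table stored as a dict keyed by (i, j, k)."""
    fitted = {}
    for i in range(1, I + 1):
        x_i = sum(x[i, j, k] for j in range(1, J + 1) for k in range(1, K + 1))
        for j, k in product(range(1, J + 1), range(1, K + 1)):
            x_ij = sum(x[i, j, kk] for kk in range(1, K + 1))
            x_ik = sum(x[i, jj, k] for jj in range(1, J + 1))
            fitted[i, j, k] = x_ij * x_ik / x_i
    return fitted


# Hypothetical frequencies (two age groups only).
x = {(1, 1, 1): 10, (1, 1, 2): 5, (1, 2, 1): 40, (1, 2, 2): 145,
     (2, 1, 1): 30, (2, 1, 2): 10, (2, 2, 1): 60, (2, 2, 2): 100}
fit = fit_conditional_independence(x, 2, 2, 2)

# Logit of the first response from the fitted table, for either level of k.
logit_fitted_k1 = math.log(fit[1, 1, 1] / fit[1, 2, 1])
logit_fitted_k2 = math.log(fit[1, 1, 2] / fit[1, 2, 2])
# Marginal logit ln(x(11.)/x(12.)) from the observed table.
logit_marginal = math.log((x[1, 1, 1] + x[1, 1, 2]) / (x[1, 2, 1] + x[1, 2, 2]))
```

Both fitted logits agree with the marginal logit, because the factor x(i.k)/x(i..) cancels in the ratio.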


From Fig. 1 we have the parametric representation

$$
\ln\frac{x^*_d(i1k)}{x^*_d(i2k)} = \tau^{B}_{1} + \tau^{AB}_{i1}; \qquad \ln\frac{x^*_d(ij1)}{x^*_d(ij2)} = \tau^{W}_{1} + \tau^{AW}_{i1}, \qquad i = 1, 2, \ldots, 8,
$$

$$
\ln\frac{x^*_d(91k)}{x^*_d(92k)} = \tau^{B}_{1}; \qquad \ln\frac{x^*_d(9j1)}{x^*_d(9j2)} = \tau^{W}_{1}.
$$


The values of the parameters in the parametric representation of
the logits are $\tau^{B}_{1} = -0.3196$, $\tau^{W}_{1} = -0.2263$, and

i     $\tau^{AB}_{i1}$     $\tau^{AW}_{i1}$
1     -4.4762     -2.6512
2     -3.6872     -2.3380
3     -3.0106     -1.8714
4     -2.4191     -1.6241
5     -1.8993     -1.1955
6     -1.4214     -0.8840
7     -0.7823     -0.5713
8     -0.4394     -0.3466
9      0           0

In particular, Grizzle's objective was to calculate two lines
relating the marginal logits to age, that is, to estimate and test the
hypothesis

$$
\ln\frac{x^*_d(i1k)}{x^*_d(i2k)} = \alpha_1 + i\beta_1; \qquad \ln\frac{x^*_d(ij1)}{x^*_d(ij2)} = \alpha_2 + i\beta_2, \qquad i = 1, \ldots, 9.
$$


But this hypothesis implies that the first-order differences in logits
across age groups are constant, or in view of the parametric representation,
that the first-order differences in the effect parameters are constant.
These chains of equalities permit us to express the parameters $\tau^{AB}_{i1}$, $\tau^{AW}_{i1}$
in terms of $\tau^{AB}_{11}$ and $\tau^{AW}_{11}$ as

$$
\tau^{AB}_{i1} = \frac{9-i}{8}\,\tau^{AB}_{11}, \qquad \tau^{AW}_{i1} = \frac{9-i}{8}\,\tau^{AW}_{11}, \qquad i = 1, \ldots, 8.
$$


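These relations among the parameters mean that in the log-linear
representation the terms

$$
\cdots\; \tau^{AB}_{11}T^{AB}_{11}(ijk) + \tau^{AB}_{21}T^{AB}_{21}(ijk) + \cdots + \tau^{AB}_{81}T^{AB}_{81}(ijk) \;\cdots
$$

reduce to

$$
\tau^{AB}_{11}\left(T^{AB}_{11}(ijk) + \tfrac{7}{8}T^{AB}_{21}(ijk) + \tfrac{6}{8}T^{AB}_{31}(ijk) + \cdots + \tfrac{1}{8}T^{AB}_{81}(ijk)\right)
$$

and the terms

$$
\cdots\; \tau^{AW}_{11}T^{AW}_{11}(ijk) + \tau^{AW}_{21}T^{AW}_{21}(ijk) + \cdots + \tau^{AW}_{81}T^{AW}_{81}(ijk) \;\cdots
$$

reduce to

$$
\tau^{AW}_{11}\left(T^{AW}_{11}(ijk) + \tfrac{7}{8}T^{AW}_{21}(ijk) + \tfrac{6}{8}T^{AW}_{31}(ijk) + \cdots + \tfrac{1}{8}T^{AW}_{81}(ijk)\right).
$$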


If we denote the estimate satisfying logit linearity by $x^*_m$ then its
design matrix or log-linear representation is given by columns 1-11,
37, 38 of Fig. 1, where we use $\tau^{AB}$ and $\tau^{AW}$ respectively instead of
$\tau^{AB}_{11}$ and $\tau^{AW}_{11}$.


The values of $x^*_m$ were determined using the generalised iterative
scaling procedure of Darroch and Ratcliff (1972) subject to the
constraints

$$
x^*_m(i..) = x(i..), \qquad x^*_m(.j.) = x(.j.), \qquad x^*_m(..k) = x(..k),
$$

$$
\sum_{i=1}^{8}\frac{9-i}{8}\,x^*_m(i1.) = \sum_{i=1}^{8}\frac{9-i}{8}\,x(i1.), \qquad \sum_{i=1}^{8}\frac{9-i}{8}\,x^*_m(i.1) = \sum_{i=1}^{8}\frac{9-i}{8}\,x(i.1).
$$

The values of $x^*_m(ijk)$ are given in Table 3. The values of the tau
parameters appearing in the linear model of the logits are

$$
\tau^{B}_{1} = -0.2098, \quad \tau^{AB} = -4.0996, \quad \tau^{W}_{1} = -0.1841, \quad \tau^{AW} = -2.6068.
$$


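The corresponding values of the logit representation in terms of the
$\alpha$'s and $\beta$'s as used by Grizzle (1971) are obtained from

$$
\alpha_1 + 9\beta_1 = \tau^{B}_{1}, \qquad \alpha_1 + \beta_1 = \tau^{B}_{1} + \tau^{AB},
$$

$$
\alpha_2 + 9\beta_2 = \tau^{W}_{1}, \qquad \alpha_2 + \beta_2 = \tau^{W}_{1} + \tau^{AW},
$$

which give

$$
\alpha_1 = -4.8219, \quad \beta_1 = 0.5125, \quad \alpha_2 = -3.1167, \quad \beta_2 = 0.3259.
$$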
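The two pairs of equations above can be solved directly, since subtracting the second equation of a pair from the first gives $8\beta = -\tau$ for the slope parameter. The short Python sketch below reproduces the arithmetic with the tau values quoted in the text; the function name `line_from_taus` is illustrative only.

```python
def line_from_taus(tau_ref, tau_slope, n_groups=9):
    """Solve alpha + n*beta = tau_ref and alpha + beta = tau_ref + tau_slope."""
    beta = -tau_slope / (n_groups - 1)
    alpha = tau_ref - n_groups * beta
    return alpha, beta


# Breathlessness: tau_1^B and tau^AB as quoted in the text.
alpha1, beta1 = line_from_taus(-0.2098, -4.0996)
# Wheeze: tau_1^W and tau^AW as quoted in the text.
alpha2, beta2 = line_from_taus(-0.1841, -2.6068)
```

Up to rounding in the fourth decimal place, the results agree with the values of the $\alpha$'s and $\beta$'s listed above.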

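We also note that

$$
\operatorname{Var}(\alpha_1) = \operatorname{Var}(\tau^{B}_{1}) + (81/64)\operatorname{Var}(\tau^{AB}) + (18/8)\operatorname{Cov}(\tau^{B}_{1}, \tau^{AB}), \qquad \operatorname{Var}(\beta_1) = (1/64)\operatorname{Var}(\tau^{AB}),
$$

$$
\operatorname{Var}(\alpha_2) = \operatorname{Var}(\tau^{W}_{1}) + (81/64)\operatorname{Var}(\tau^{AW}) + (18/8)\operatorname{Cov}(\tau^{W}_{1}, \tau^{AW}), \qquad \operatorname{Var}(\beta_2) = (1/64)\operatorname{Var}(\tau^{AW}).
$$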

The variance-covariance matrix of the taus for $x^*_m$ is obtained as
follows (a weighted version of the procedure used in Kullback 1959,
p. 217). Compute S = T'DT where T is the design matrix for the log-linear
representation of $x^*_m$ (columns 1-11, 37, 38 of Fig. 1), and D is a diagonal
matrix whose entries are the values of $x^*_m(ijk)$ in the order of the rows
of the design matrix. Partition the matrix S as

$$
S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}
$$

where $S_{11}$ is 1 x 1.

Then the variance-covariance matrix of the taus is $(S_{22} - S_{21}S_{11}^{-1}S_{12})^{-1}$.
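The recipe just described can be sketched in a few lines of Python. This is a minimal illustration and not the program used for the report: the helper names `invert` and `tau_covariance` are hypothetical, the first column of T is assumed to be the constant column (so that $S_{11}$ is 1 x 1), and the two-cell example is made up.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(r == c) for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def tau_covariance(T, fitted):
    """Return (S22 - S21 S11^{-1} S12)^{-1} with S = T' D T, D = diag(fitted)."""
    p = len(T[0])
    S = [[sum(T[w][a] * fitted[w] * T[w][b] for w in range(len(T)))
          for b in range(p)] for a in range(p)]
    s11 = S[0][0]
    s22_1 = [[S[a][b] - S[a][0] * S[0][b] / s11 for b in range(1, p)]
             for a in range(1, p)]
    return invert(s22_1)


# Two cells, constant column plus one indicator: tau is then a log-odds.
T = [[1, 1], [1, 0]]
cov = tau_covariance(T, [30.0, 70.0])
```

In this two-cell case the result is 1/30 + 1/70, the familiar large-sample variance of a log-odds, which gives a quick sanity check on the partitioned-inverse formula.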
For comparison we list the values as given by Grizzle (1971) and
computed from $x^*_m$.

              Grizzle (1971)          $x^*_m$
$\alpha_1$:   -4.8174 ± 0.0848        -4.8219 ± 0.0835
$\beta_1$:     0.5123 ± 0.0124         0.5125 ± 0.0129
$\alpha_2$:   -3.1135 ± 0.0558        -3.1167 ± 0.0549
$\beta_2$:     0.3253 ± 0.0090         0.3258 ± 0.0089

The associated analysis of information Table 4 provides a basis
for tests of significance and goodness-of-fit.

Table 4

Analysis of Information

Component due to                    Information                      D.F.
Interaction (linear logit model)    $2I(x:x^*_m) = 3077.154$         23
Effect                              $2I(x^*_d:x^*_m) = 25.300$       14
Interaction (marginal logits)       $2I(x:x^*_d) = 3051.854$          9




We infer from $2I(x:x^*_m)$ and $2I(x:x^*_d)$ that neither $x^*_m$ nor $x^*_d$ is a
good estimate for the joint response data; that is, $2I(x:x^*_m)$ ($2I(x:x^*_d)$)
is a measure of the goodness-of-fit of the linear logit model (marginal
logit model) to the joint response data. $2I(x^*_d:x^*_m)$ is a measure of the
effect of the relationship among the parameters $\tau^{AB}_{11}, \ldots, \tau^{AB}_{81}$ and
$\tau^{AW}_{11}, \tau^{AW}_{21}, \ldots, \tau^{AW}_{81}$ implied by the hypothesis of logit
linearity. We remark that $x^*_m$ and $x^*_d$ correspond respectively to models 3
and 8 of Mantel and Brown (1973).

We shall return to the question of finding a model providing an
acceptable fit to the joint response data of Table 1 after considering
data giving the prevalence of persistent cough and persistent phlegm
amongst the same group of miners.

In Table 5 is given a 9x2x2 cross-classification of the same
miners as in Table 1, but showing the combined prevalence of persistent
cough and persistent phlegm. We denote the observed frequency in any
cell by x(ijk), where i indexes the age group, j cough (C) and k phlegm (P).

Since Table 5 has the same dimensions as Table 1 the design
matrix and log-linear representation in Fig. 1 and the log-linear
regression (1) for the x(ijk) values of Table 1 will be the same for the
x(ijk) of Table 5 with the replacement of the superscripts B, W by C, P
respectively.




To determine the significance of effects and whether or not there
is second-order interaction we fit a sequence of nested models based on
the marginals

$H_a$: x(i..), x(.jk)
$H_b$: x(.jk), x(ij.)
$H_c$: x(.jk), x(ij.), x(i.k)

and denote the corresponding m.d.i. estimates by $x^*_a$, $x^*_b$, $x^*_c$ respectively.
We note that $x^*_a$ and $x^*_b$ have the explicit form $x^*_a(ijk) = x(i..)\,x(.jk)/n$,
$x^*_b(ijk) = x(ij.)\,x(.jk)/x(.j.)$ but $x^*_c$ cannot be explicitly represented as
a product of marginals. $H_a$ is the null hypothesis that the incidence of
cough and phlegm is homogeneous over the age groups. $H_b$ is the null
hypothesis that the incidence of phlegm is homogeneous over the age
groups given the incidence of cough. $H_c$ is the null hypothesis of no
second-order interaction. The columns of Fig. 1 implied for the design
matrix or log-linear representation of the three models are

$H_a$: 1-11, 28
$H_b$: 1-19, 28
$H_c$: 1-28.

The hypotheses may also be stated as implying that the parameters
corresponding to the columns of Fig. 1 not used in the design matrix or
log-linear representation are zero. Analysis of information Table 6
summarizes the results.
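The two explicit estimates and the additive breakdown of the information statistics for nested models can be illustrated numerically. The following Python sketch uses a made-up 3x2x2 table (not the data of Table 5), zero-based indices, and the hypothetical names `nested_fits` and `two_i`; it is meant only to show the arithmetic.

```python
import math
from itertools import product


def nested_fits(x, I, J, K):
    """Closed-form fits: a) x(i..)x(.jk)/n and b) x(ij.)x(.jk)/x(.j.)."""
    cells = list(product(range(I), range(J), range(K)))
    n = sum(x.values())
    x_i = {i: sum(x[i, j, k] for j in range(J) for k in range(K)) for i in range(I)}
    x_j = {j: sum(x[i, j, k] for i in range(I) for k in range(K)) for j in range(J)}
    x_jk = {(j, k): sum(x[i, j, k] for i in range(I))
            for j in range(J) for k in range(K)}
    x_ij = {(i, j): sum(x[i, j, k] for k in range(K))
            for i in range(I) for j in range(J)}
    fit_a = {c: x_i[c[0]] * x_jk[c[1], c[2]] / n for c in cells}
    fit_b = {c: x_ij[c[0], c[1]] * x_jk[c[1], c[2]] / x_j[c[1]] for c in cells}
    return fit_a, fit_b


def two_i(p, q):
    """Minimum discrimination information statistic 2I(p:q) = 2 sum p ln(p/q)."""
    return 2.0 * sum(p[c] * math.log(p[c] / q[c]) for c in p if p[c] > 0)


# Hypothetical frequencies.
x = {(0, 0, 0): 10, (0, 0, 1): 5, (0, 1, 0): 40, (0, 1, 1): 145,
     (1, 0, 0): 30, (1, 0, 1): 10, (1, 1, 0): 60, (1, 1, 1): 100,
     (2, 0, 0): 50, (2, 0, 1): 30, (2, 1, 0): 40, (2, 1, 1): 80}
fit_a, fit_b = nested_fits(x, 3, 2, 2)

total = two_i(x, fit_a)        # 2I(x : x*_a)
effect = two_i(fit_b, fit_a)   # 2I(x*_b : x*_a)
residual = two_i(x, fit_b)     # 2I(x : x*_b)
```

Here `total` equals `effect + residual` up to floating-point error, which is the additive relation exhibited by the rows of an analysis of information table.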
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Table  6 

Analysis of Information

Component due to               Information                      D.F.
a) x(i..), x(.jk)              $2I(x:x^*_a) = 1259.090$         24
b) x(.jk), x(ij.)              $2I(x^*_b:x^*_a) = 1180.385$      8
                               $2I(x:x^*_b) = 78.705$           16
c) x(.jk), x(ij.), x(i.k)      $2I(x^*_c:x^*_b) = 72.009$        8
                               $2I(x:x^*_c) = 6.696$             8


From Table 6 we infer that the 8 interaction parameters
corresponding to columns 29-36 of Fig. 1 may be taken as zero. From
Fig. 1 we see that the parametric representation of the log-odds or
logits under the model of no second-order interaction is

$$
\ln\frac{x^*_c(i11)}{x^*_c(i21)} = \tau^{C}_{1} + \tau^{AC}_{i1} + \tau^{CP}_{11}, \qquad \ln\frac{x^*_c(i12)}{x^*_c(i22)} = \tau^{C}_{1} + \tau^{AC}_{i1},
$$

$$
\ln\frac{x^*_c(i11)}{x^*_c(i12)} = \tau^{P}_{1} + \tau^{AP}_{i1} + \tau^{CP}_{11}, \qquad \ln\frac{x^*_c(i21)}{x^*_c(i22)} = \tau^{P}_{1} + \tau^{AP}_{i1}, \qquad i = 1, 2, \ldots, 9.
$$


The values of $x^*_c$ are given in Table 8.
The values of the parameters in the parametric representation of the
logits are $\tau^{C}_{1} = -2.0987$, $\tau^{P}_{1} = -2.4756$, $\tau^{CP}_{11} = 3.8500$, and

i     $\tau^{AC}_{i1}$     $\tau^{AP}_{i1}$
1     -1.7955     -0.7132
2     -1.5083     -0.6904
3     -1.1155     -0.6729
4     -1.0052     -0.5734
5     -0.5939     -0.5473
6     -0.3801     -0.4448
7     -0.1422     -0.3070
8     -0.1103     -0.0639
9      0           0

The covariance matrix of these 19 parameters has been computed,
but is not given herein.

We mention however that the variance of $\tau^{CP}_{11}$ is 0.003116 so that
$X^2 = (3.85)^2/0.003116 = 4756.90$ is approximately a chi-squared with
one degree of freedom. We see in Analysis of Information Table 7 a
verification of the fact that the association parameter $\tau^{CP}_{11}$ is very
significantly different from zero.
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Table  7 

Analysis of Information

Component due to               Information                      D.F.
e) x(ij.), x(i.k)              $2I(x:x^*_e) = 6273.746$          9
c) x(ij.), x(i.k), x(.jk)      $2I(x^*_c:x^*_e) = 6267.050$      1
                               $2I(x:x^*_c) = 6.696$             8


We remark that $H_e$: x(ij.), x(i.k) represents the model that
cough and phlegm are not associated given the age grouping. The
corresponding estimate may be explicitly represented as

$$
x^*_e(ijk) = x(ij.)\,x(i.k)/x(i..).
$$

$2I(x^*_c:x^*_e)$ tests the null hypothesis that $\tau^{CP}_{11} = 0$ and the value of
$2I(x:x^*_c) = 6.696$, 8 D.F. implies that the association between cough
and phlegm has the same value over all the age groupings.

We now examine the hypothesis that the logits of $x^*_c$ vary linearly
with age, that is, that successive differences of the logits are
constant. As before we can express the parameters $\tau^{AC}_{i1}$, $\tau^{AP}_{i1}$ under this
hypothesis in terms of $\tau^{AC}_{11}$ and $\tau^{AP}_{11}$ as

$$
H_n: \quad \tau^{AC}_{i1} = \frac{9-i}{8}\,\tau^{AC}_{11}, \qquad \tau^{AP}_{i1} = \frac{9-i}{8}\,\tau^{AP}_{11}, \qquad i = 1, \ldots, 8.
$$

If we denote the estimate satisfying logit linearity within the model of
no second-order interaction by $x^*_n$, then the design matrix or log-linear
representation corresponding to $H_n$ is given by columns 1-11, 28, 37, 38 of
Fig. 1, of course, with the replacement of the superscripts B, W by C, P
respectively and the use of $\tau^{AC}$, $\tau^{AP}$ instead of $\tau^{AC}_{11}$, $\tau^{AP}_{11}$ respectively
for convenience.

The values of $x^*_n$ are given in Table 9. The values of the
parameters in the logit representation under the logit linearity model,

$$
\ln\frac{x^*_n(i11)}{x^*_n(i21)} = \tau^{C}_{1} + \frac{9-i}{8}\,\tau^{AC} + \tau^{CP}_{11}, \qquad \ln\frac{x^*_n(i12)}{x^*_n(i22)} = \tau^{C}_{1} + \frac{9-i}{8}\,\tau^{AC},
$$

$$
\ln\frac{x^*_n(i11)}{x^*_n(i12)} = \tau^{P}_{1} + \frac{9-i}{8}\,\tau^{AP} + \tau^{CP}_{11}, \qquad \ln\frac{x^*_n(i21)}{x^*_n(i22)} = \tau^{P}_{1} + \frac{9-i}{8}\,\tau^{AP},
$$

are

$$
\tau^{C}_{1} = -1.8939, \quad \tau^{P}_{1} = -2.5495, \quad \tau^{AC} = -1.8312, \quad \tau^{AP} = -0.7646, \quad \tau^{CP}_{11} = 3.8442.
$$

The covariance matrix of these five parameters is given in Table
10. The associated analysis of information is given in Table 11.
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Table  11 

Analysis of Information

Component due to     Information                      D.F.
$H_n$                $2I(x:x^*_n) = 28.831$           22
$H_c$                $2I(x^*_c:x^*_n) = 22.135$       14
                     $2I(x:x^*_c) = 6.696$             8


The value $2I(x:x^*_n)$ is a measure of the goodness-of-fit of the logit
linearity model and $2I(x^*_c:x^*_n)$ is a measure of the effect of replacing the
common parameters $\tau^{AC}$, $\tau^{AP}$ by $\tau^{AC}_{i1}$, $\tau^{AP}_{i1}$, i = 1, ..., 8. It is clear that $x^*_c$
provides a better fit to the original data than $x^*_n$, using more parameters
however, but at the 5% level of significance the logit linearity model
provides an acceptable fit, with a simpler model.

In our analysis of the incidence of cough and phlegm over the age
groups we concluded that the association of these factors was the same
over all the age groupings. However, in multidimensional contingency
tables in which, for example, time or age is one of the classifications,
there may occur an age effect such that an hypothesis of interest may be
rejected for the entire table, but an hypothesis taking the possible age
effect into account may produce an acceptable partitioning. We now
propose to illustrate techniques applicable to the solution of such
problems by a further study of the 9x2x2 contingency Table 1, containing
nine age groupings, for which the hypothesis of no second-order interaction
is rejected. An acceptable partitioning is determined. Within the
partitioned model we then consider a subhypothesis of logit linearity
(Kullback and Fisher, 1973).
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Let  us  now  find  the  estimate  under  the  classic  null  hypothesis 
of no second-order interaction. The minimum discrimination information
estimate $x^*_c(ijk)$ under the hypothesis of no second-order interaction
is obtained by iteratively fitting the marginals x(ij.), x(i.k), x(.jk)
(see Ku et al., 1971, for example) and is given in Table 12. The
design matrix or log-linear representation of $x^*_c(ijk)$ is given by the
columns 1-28 in Fig. 1. Indeed, the no second-order interaction
hypothesis is that the values of the last eight parameters in x(ijk)
have the hypothetical values

$$
\tau^{ABW}_{111} = \tau^{ABW}_{211} = \cdots = \tau^{ABW}_{811} = 0. \tag{2}
$$

Computing the associated minimum discrimination information statistic
we find

$$
2I(x:x^*_c) = 2\sum_i\sum_j\sum_k x(ijk)\,\ln\bigl(x(ijk)/x^*_c(ijk)\bigr) = 26.673, \qquad 8 \text{ D.F.}
$$

We recall that this is the same as the log-likelihood ratio chi-squared
statistic (see e.g. Darroch 1962). We reject the null hypothesis of no
second-order interaction, that is, the hypothetical values in (2) are
not acceptable parameters for x(ijk).
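The iterative fitting of the three two-way marginals can be sketched as iterative proportional fitting. The Python code below is an illustration under stated assumptions: the function name `fit_no_second_order`, the fixed number of sweeps, the uniform starting table, the zero-based indices and the 3x2x2 frequencies are all hypothetical, and the report's own program is not reproduced.

```python
import math
from itertools import product


def fit_no_second_order(x, I, J, K, sweeps=200):
    """Cycle through the marginals x(ij.), x(i.k), x(.jk), rescaling to match each."""
    cells = list(product(range(I), range(J), range(K)))
    n = sum(x.values())
    fit = {c: n / len(cells) for c in cells}  # uniform start
    for _ in range(sweeps):
        for i, j in product(range(I), range(J)):
            ratio = sum(x[i, j, k] for k in range(K)) / sum(fit[i, j, k] for k in range(K))
            for k in range(K):
                fit[i, j, k] *= ratio
        for i, k in product(range(I), range(K)):
            ratio = sum(x[i, j, k] for j in range(J)) / sum(fit[i, j, k] for j in range(J))
            for j in range(J):
                fit[i, j, k] *= ratio
        for j, k in product(range(J), range(K)):
            ratio = sum(x[i, j, k] for i in range(I)) / sum(fit[i, j, k] for i in range(I))
            for i in range(I):
                fit[i, j, k] *= ratio
    return fit


# Hypothetical frequencies.
x = {(0, 0, 0): 10, (0, 0, 1): 5, (0, 1, 0): 40, (0, 1, 1): 145,
     (1, 0, 0): 30, (1, 0, 1): 10, (1, 1, 0): 60, (1, 1, 1): 100,
     (2, 0, 0): 50, (2, 0, 1): 30, (2, 1, 0): 40, (2, 1, 1): 80}
fit = fit_no_second_order(x, 3, 2, 2)

# 2I(x : x*) = 2 sum x ln(x / x*), here with (I-1)(J-1)(K-1) = 2 degrees of freedom.
two_i = 2.0 * sum(x[c] * math.log(x[c] / fit[c]) for c in x)
# Log cross-product ratio of the fitted table in each first-factor level.
log_cpr = [math.log(fit[i, 0, 0] * fit[i, 1, 1] / (fit[i, 0, 1] * fit[i, 1, 0]))
           for i in range(3)]
```

Because every rescaling multiplies the table by a function of two indices only, the fitted log cross-product ratio is the same in every level of the first factor, which is what the absence of second-order interaction means.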

Among other properties the null hypothesis of no second-order
interaction implies a common value for the association (measured by the
logarithm of the cross-product ratio) between breathlessness and wheeze
over all age-groups. In terms of the parameters defining $x^*_c(ijk)$ this
common value as determined from columns 1-28 of Fig. 1 is

$$
\ln\frac{x^*_c(i11)\,x^*_c(i22)}{x^*_c(i12)\,x^*_c(i21)} = \tau^{BW}_{11} = 2.8348, \qquad i = 1, 2, \ldots, 9.
$$

We summarize the results and supplement analysis of information
Table 4 by analysis of information Table 13.

Table 13

Analysis of Information

Component due to               Information                      D.F.
d) x(ij.), x(i.k)              $2I(x:x^*_d) = 3051.854$          9
c) x(ij.), x(i.k), x(.jk)      $2I(x^*_c:x^*_d) = 3025.181$      1
                               $2I(x:x^*_c) = 26.673$            8

The value of $2I(x^*_c:x^*_d)$ implies a significant (nonzero) association
between breathlessness and wheeze but the value of $2I(x:x^*_c)$ leads us to
conclude that there is not a common value of this association over all the
age groups. We note that the estimate $x^*_c$ corresponds to model 9 of
Mantel and Brown (1973).

It seems reasonable to conjecture that the presence of second-order
interaction may be related to an age effect. That is, there may be a
common value of the association between breathlessness and wheeze over some
of the younger age groups and a common but different value of this
association over the remaining age groups. We therefore re-examined the computer
output for $x^*_c$. Among other items there was given for each cell a number
called OUTLIER, the value of

$$
2\left( x(ijk)\,\ln\frac{x(ijk)}{x^*(ijk)} + (n - x(ijk))\,\ln\frac{n - x(ijk)}{n - x^*(ijk)} \right).
$$

Ireland (1972) has shown that large values of OUTLIER are effective in
recognizing outliers under the estimation procedure in question. In
the case at hand the value of OUTLIER for cell 812 was 4.959 with the
next largest value 2.722 for cell 212.
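The OUTLIER quantity is a two-term information statistic comparing one cell against the rest of the table. A minimal Python sketch of the formula follows; the function name `outlier` and the numbers in the example calls are illustrative and are not values from the report.

```python
import math


def outlier(x_cell, fitted_cell, n):
    """2 [ x ln(x/x*) + (n - x) ln((n - x)/(n - x*)) ] for a single cell."""
    first = x_cell * math.log(x_cell / fitted_cell) if x_cell > 0 else 0.0
    rest = n - x_cell
    second = rest * math.log(rest / (n - fitted_cell)) if rest > 0 else 0.0
    return 2.0 * (first + second)


exact_fit = outlier(20.0, 20.0, 1000.0)   # observed equals fitted
poor_fit = outlier(30.0, 20.0, 1000.0)    # observed well above fitted
```

The statistic is zero when the cell is fitted exactly and grows as the observed and fitted values diverge, so sorting the cells by this value is one simple way to flag candidates for closer inspection.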

Let us therefore consider a partitioning of the second-order
interaction for the age groups under 55 and for the age groups 55 and
over by computing the minimum discrimination information estimate
$x^*_t(ijk)$ subject to the marginal restraints of $x^*_c(ijk)$ and also the
restraints

$$
\tau^{ABW}_{111} = \tau^{ABW}_{211} = \cdots = \tau^{ABW}_{711}, \qquad \tau^{ABW}_{811} = \tau^{ABW}_{911}. \tag{3}
$$

The design matrix or log-linear representation for $x^*_t(ijk)$ is given by
columns 1-28, 39 in Fig. 1, that is, with the eight columns corresponding
to $\tau^{ABW}_{111}, \tau^{ABW}_{211}, \ldots, \tau^{ABW}_{811}$ replaced by the one column labeled $\tau^{ABW}$. The
values of $x^*_t(ijk)$ are given in Table 14. In terms of the parameters
defining $x^*_t(ijk)$, from columns 1-28, 39 in Fig. 1, it is found that

$$
\ln\frac{x^*_t(i11)\,x^*_t(i22)}{x^*_t(i12)\,x^*_t(i21)} = \tau^{BW}_{11} + \tau^{ABW} = 3.0007, \qquad i = 1, \ldots, 7,
$$

$$
\ln\frac{x^*_t(i11)\,x^*_t(i22)}{x^*_t(i12)\,x^*_t(i21)} = \tau^{BW}_{11} = 2.5212, \qquad i = 8, 9.
$$

The associated analysis of information Table 15 summarizes results.

Table 15

Analysis of Information

Component due to                Information                      D.F.
No second-order interaction     $2I(x:x^*_c) = 26.673$            8
Effect                          $2I(x^*_t:x^*_c) = 16.700$        1
Interaction (partition)         $2I(x:x^*_t) = 9.973$             7

We note that $2I(x^*_t:x^*_c)$ which measures the effect of the hypothesis
in (3) is very significant, and from the value of $2I(x:x^*_t)$ we may accept
the inference that there is a common association between breathlessness
and wheeze for the age groups under 55 and a different but common value
for the age groups 55 and over and that in fact $x^*_t(ijk)$ is a good fit to
the original data.

We remark that, as a matter of fact, the values of $x^*_t(ijk)$ were
computed by iteratively fitting all the two-way marginals of the 7x2x2
table of the age groups under 55 and separately iteratively fitting all
the two-way marginals of the 2x2x2 table of the age groups 55 and over.

To verify the indication given by OUTLIER we also examined the
other possible "break points" with the following results

Partition     2I(x:x*)     D.F.
Under 35       0.612       2
Over 35       15.990       5
Under 40       1.856       3
Over 40       11.541       4
Under 45       3.311       4
Over 45        8.373       3
Under 50       8.420       5
Over 50        7.861       2

These values confirm the inference suggested by OUTLIER.

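If we now consider the logits for breathlessness and wheeze,
respectively, for the age groups under 55, from the design matrix or
log-linear representation for $x^*_t(ijk)$ in Fig. 1 (columns 1-28, 39) we
see that

$$
\ln\frac{x^*_t(i11)}{x^*_t(i21)} = \tau^{B}_{1} + \tau^{AB}_{i1} + \tau^{BW}_{11} + \tau^{ABW}; \qquad \ln\frac{x^*_t(i12)}{x^*_t(i22)} = \tau^{B}_{1} + \tau^{AB}_{i1}, \qquad i = 1, \ldots, 7,
$$

$$
\ln\frac{x^*_t(i11)}{x^*_t(i12)} = \tau^{W}_{1} + \tau^{AW}_{i1} + \tau^{BW}_{11} + \tau^{ABW}; \qquad \ln\frac{x^*_t(i21)}{x^*_t(i22)} = \tau^{W}_{1} + \tau^{AW}_{i1}.
$$

The corresponding logits for the age groups 55 and over are given by

$$
\ln\frac{x^*_t(811)}{x^*_t(821)} = \tau^{B}_{1} + \tau^{AB}_{81} + \tau^{BW}_{11}; \qquad \ln\frac{x^*_t(812)}{x^*_t(822)} = \tau^{B}_{1} + \tau^{AB}_{81},
$$

$$
\ln\frac{x^*_t(911)}{x^*_t(921)} = \tau^{B}_{1} + \tau^{BW}_{11}; \qquad \ln\frac{x^*_t(912)}{x^*_t(922)} = \tau^{B}_{1},
$$

$$
\ln\frac{x^*_t(811)}{x^*_t(812)} = \tau^{W}_{1} + \tau^{AW}_{81} + \tau^{BW}_{11}; \qquad \ln\frac{x^*_t(821)}{x^*_t(822)} = \tau^{W}_{1} + \tau^{AW}_{81},
$$

$$
\ln\frac{x^*_t(911)}{x^*_t(912)} = \tau^{W}_{1} + \tau^{BW}_{11}; \qquad \ln\frac{x^*_t(921)}{x^*_t(922)} = \tau^{W}_{1}.
$$

The numerical values of these logits are given in Table 16.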

We now consider the hypothesis that within the partitioned no
second-order interaction hypothesis, that is, within the $x^*_t(ijk)$ model, the logits
are linearly related for the age groups under 55. In other words, we
consider the fitting of straight lines to the logits for the age groups
under 55 by assuming that the differences of logits for successive age
groups are constant.

Thus we shall consider a null hypothesis that

$$
\tau^{AB}_{71} - \tau^{AB}_{61} = \tau^{AB}_{61} - \tau^{AB}_{51} = \tau^{AB}_{51} - \tau^{AB}_{41} = \cdots = \tau^{AB}_{21} - \tau^{AB}_{11},
$$

$$
\tau^{AW}_{71} - \tau^{AW}_{61} = \tau^{AW}_{61} - \tau^{AW}_{51} = \tau^{AW}_{51} - \tau^{AW}_{41} = \cdots = \tau^{AW}_{21} - \tau^{AW}_{11}.
$$

If, as a matter of convenience, we consider the design matrix or
log-linear representation of $x^*_t(ijk)$ as in Fig. 2, that is, a
reparametrization of the log-linear representation in Fig. 1, then the chains of
equalities yield the relations among the parameters

$$
\tau^{AB}_{i1} = \frac{7-i}{6}\,\tau^{AB}_{11}, \qquad \tau^{AW}_{i1} = \frac{7-i}{6}\,\tau^{AW}_{11}, \qquad i = 1, 2, \ldots, 7.
$$


The design matrix or log-linear representation for the linear logit
model estimate $x^*_v(ijk)$, using $\tau^{AB}$ and $\tau^{AW}$ respectively, instead of
$\tau^{AB}_{11}$ and $\tau^{AW}_{11}$, is given in columns 1-11, 28-31 of Fig. 2. The values in
columns 30, 31 arise from the fact that in the log-linear representation
as in (1) the terms

$$
\tau^{AB}_{11}T^{AB}_{11}(ijk) + \tau^{AB}_{21}T^{AB}_{21}(ijk) + \cdots + \tau^{AB}_{61}T^{AB}_{61}(ijk)
$$

and the terms

$$
\tau^{AW}_{11}T^{AW}_{11}(ijk) + \tau^{AW}_{21}T^{AW}_{21}(ijk) + \cdots + \tau^{AW}_{61}T^{AW}_{61}(ijk)
$$

because of the relations among the parameters reduce to

$$
\tau^{AB}\left(T^{AB}_{11}(ijk) + (5/6)T^{AB}_{21}(ijk) + (4/6)T^{AB}_{31}(ijk) + \cdots + (1/6)T^{AB}_{61}(ijk)\right)
$$

and

$$
\tau^{AW}\left(T^{AW}_{11}(ijk) + (5/6)T^{AW}_{21}(ijk) + (4/6)T^{AW}_{31}(ijk) + \cdots + (1/6)T^{AW}_{61}(ijk)\right)
$$

respectively.


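The iteration used to compute $x^*_v(ijk)$ is (see Darroch and
Ratcliff 1972)

$$
\begin{aligned}
x^{(5n+1)}(ijk) &= \frac{x(i..)}{x^{(5n)}(i..)}\; x^{(5n)}(ijk), \\
x^{(5n+2)}(ijk) &= \frac{x(.j.)}{x^{(5n+1)}(.j.)}\; x^{(5n+1)}(ijk), \\
x^{(5n+3)}(ijk) &= \frac{x(..k)}{x^{(5n+2)}(..k)}\; x^{(5n+2)}(ijk), \\
x^{(5n+4)}(ijk) &= \left(\frac{h_1}{h_1^{(5n+3)}}\right)^{a_1(ijk)} \left(\frac{h_2}{h_2^{(5n+3)}}\right)^{a_2(ijk)} \left(\frac{h_3}{h_3^{(5n+3)}}\right)^{a_3(ijk)} x^{(5n+3)}(ijk), \\
x^{(5n+5)}(ijk) &= \left(\frac{k_1}{k_1^{(5n+4)}}\right)^{b_1(ijk)} \left(\frac{k_2}{k_2^{(5n+4)}}\right)^{b_2(ijk)} \left(\frac{k_3}{k_3^{(5n+4)}}\right)^{b_3(ijk)} x^{(5n+4)}(ijk),
\end{aligned}
$$

$$
x^{(0)}(ijk) = n/28, \qquad n = \sum_{i=1}^{7}\sum_{j=1}^{2}\sum_{k=1}^{2} x(ijk),
$$

where $h_m^{(5n+3)}$ and $k_m^{(5n+4)}$ denote the values of the restraints computed from
the iterate indicated by the superscript.
All marginals refer to the 7x2x2 table and the values of $a_m(ijk)$,
$b_m(ijk)$, m = 1, 2, 3 and the restraints $h_m$, $k_m$, m = 1, 2, 3 are given in Fig. 3.

We remark that since $x^*_v(ijk) = x^*_t(ijk)$ for i = 8, 9, we can perform the
iteration by consideration of the 7x2x2 table only. The values of
$x^*_v(ijk)$ are given in Table 17.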
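For readers unfamiliar with generalized iterative scaling, a generic single-block version can be sketched as follows. This is the standard form of the Darroch and Ratcliff update, in which nonnegative weights sum to one over the restraints for every cell, and it is not the five-step cycle or the weights of Fig. 3; the function name, the three-cell example and the iteration count are assumptions for illustration.

```python
def generalized_iterative_scaling(weights, targets, start, iterations=500):
    """Scale p so that sum_w weights[s][w] * p[w] approaches targets[s] for every s.

    Assumes weights[s][w] >= 0, sum_s weights[s][w] == 1 for each cell w,
    and sum(targets) == 1.
    """
    p = list(start)
    for _ in range(iterations):
        current = [sum(a * pw for a, pw in zip(row, p)) for row in weights]
        for w in range(len(p)):
            factor = 1.0
            for s, row in enumerate(weights):
                factor *= (targets[s] / current[s]) ** row[w]
            p[w] *= factor
    return p


# Toy problem: three cells, one linear restraint and its complement.
weights = [[1.0, 0.5, 0.0],
           [0.0, 0.5, 1.0]]
targets = [0.3, 0.7]
p = generalized_iterative_scaling(weights, targets, [1 / 3, 1 / 3, 1 / 3])
```

Each update multiplies the cell values by powers of the ratios of required to current restraint values, so starting from a uniform table the iterate stays in the exponential (log-linear) family generated by the weights while the restraints are driven to their targets.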

Results are summarized in analysis of information Table 18.

Table 18

Analysis of Information

Component due to                Information                      D.F.
Interaction (linear logits)     $2I(x:x^*_v) = 20.560$           17
Effect                          $2I(x^*_t:x^*_v) = 10.587$       10
Interaction (partition)         $2I(x:x^*_t) = 9.973$             7

Since $2I(x:x^*_v)$ and $2I(x^*_t:x^*_v)$ fall between the 50% and 20% values
of the tabulated chi-squared values with the appropriate degrees of
freedom, we might accept the null hypothesis of linearity of the logits
within the partitioned second-order interaction model, that is, infer
from the value of $2I(x^*_t:x^*_v)$ that the parameters $\tau^{AB}_{11}, \tau^{AB}_{21}, \ldots, \tau^{AB}_{71}$ and
$\tau^{AW}_{11}, \tau^{AW}_{21}, \ldots, \tau^{AW}_{71}$ of $x^*_t(ijk)$ satisfy the relations among the parameters
implied by the logit linearity and that the estimate $x^*_v(ijk)$ under the
logit linearity model is an acceptable estimate for the original
observations.

Table 12: No second-order interaction estimate
for the data of Table 1

x*(ijk)

k=1          k=2
 96.448      1839.547
110.907      1648.087
185.040      1854.947
266.585      2347.390
279.467      1771.497
321.175      1714.769
250.848      1318.129
199.319       992.729
123.210       534.909

Table 16: Logits

        $\ln\dfrac{x^*_t(i1k)}{x^*_t(i2k)}$ (Breathlessness)     $\ln\dfrac{x^*_t(ij1)}{x^*_t(ij2)}$ (Wheeze)

i       k=1         k=2             j=1         j=2
1       -2.4605     -5.4611         0.0455      -2.9552
2       -1.7904     -4.7911         0.2002      -2.7104
3       -1.3261     -4.3267         0.6806      -2.3200
4       -0.8058     -3.8065         0.8029      -2.1977
5       -0.4342     -3.4848         1.1289      -1.8717
6       -0.1100     -3.1106         1.2942      -1.7064
7        0.5303     -2.4703         1.2911      -1.7095
8        0.6288     -1.8925         1.0326      -1.4887
9        0.9799     -1.5413         1.1903      -1.3309

Table 17: Linear logit estimate within
partitioned second-order
interaction model

$x^*_v(ijk)$

            j=1                     j=2
i       k=1         k=2         k=1         k=2
1        11.860       9.990     108.934     1821.215
2        20.398      13.952     120.522     1636.127
3        44.705      24.830     169.946     1873.519
4       107.932      48.677     263.913     2362.476
5       158.232      57.944     248.880     1808.943
6       288.909      85.919     292.375     1725.797
7       416.964     100.688     271.429     1300.919
8       411.545     146.550     219.454      972.450
9       366.455     111.450     137.546      520.550

$$
\ln\frac{x^*_v(i11)\,x^*_v(i22)}{x^*_v(i12)\,x^*_v(i21)} = 2.9881, \qquad i = 1, \ldots, 7,
$$

$$
\ln\frac{x^*_v(i11)\,x^*_v(i22)}{x^*_v(i12)\,x^*_v(i21)} = 2.5212, \qquad i = 8, 9.
$$
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Figure  2:  Log-linear  representation 


5.  The General Linear Hypothesis

1.  Minimum Discrimination Information Estimation

In Chapter 3, Log-linear Representation, the minimum discrimination
information theorem was examined with particular emphasis on problems of
fitting contingency tables based on a set of observed marginals. In such
cases the T(ω) functions are indicator functions and hence take the values
0 or 1 only. In Kullback (1970) quadratic approximations to the minimum
discrimination information statistics were considered, together with the
relation of these quadratic approximations to K. Pearson's $X^2$ (Berkson, 1972).

We now propose to consider problems in which the T(ω) functions are
general linear functions of the p(ω)'s. In these problems the restraints
are determined by hypotheses of interest and one is concerned whether the
observed data are consistent therewith.

Although these considerations really are part of the general theory
already discussed it seems worthwhile to examine them in detail. We shall
use the notation, terminology, and concepts of the preceding chapters with
some slight modifications.

Appropriate computer programs have been prepared to
make application feasible.

As in Chapter 3, we want the value of p(ω) which minimizes the
discrimination information

$$
I(p:\pi) = \sum_{\Omega} p(\omega)\,\ln\frac{p(\omega)}{\pi(\omega)} \tag{1.1}
$$

over the family of p-distributions which satisfy
the restraints (using matrix notation)

$$
Cp = \theta \tag{1.2}
$$

where C is (r+1) x Ω, p is Ω x 1, θ is (r+1) x 1, and the
rank of C is r + 1 < Ω.

If we denote the elements of the matrix C by $c_i(\omega)$,
i = 1, ..., r+1, ω = 1, ..., Ω, then (1.2) is

$$
\sum_{\Omega} c_i(\omega)\,p(\omega) = \theta_i, \qquad i = 1, \ldots, r+1. \tag{1.3}
$$

We shall usually assume $c_1(\omega) = 1$, all ω, and $\theta_1 = 1$.

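In accordance with the minimum discrimination
information theorem, or by differentiation of (1.1)
with respect to p(ω) and using Lagrange multipliers,
the minimizing distribution has the form

$$
p^*(\omega) = \exp\bigl(\lambda_1 c_1(\omega) + \lambda_2 c_2(\omega) + \cdots + \lambda_{r+1} c_{r+1}(\omega)\bigr)\,\pi(\omega) \tag{1.4}
$$

or

$$
\ln\frac{p^*(\omega)}{\pi(\omega)} = \lambda_1 c_1(\omega) + \lambda_2 c_2(\omega) + \cdots + \lambda_{r+1} c_{r+1}(\omega), \qquad \omega = 1, \ldots, \Omega. \tag{1.5}
$$

This is equivalent to the version

$$
\ln\frac{p^*(\omega)}{\pi(\omega)} = L + \tau_1 T_1(\omega) + \cdots + \tau_r T_r(\omega) \tag{1.6}
$$

as used in Chapter 2, Log-linear Representation, with

$$
\lambda_1 = L, \quad \lambda_{i+1} = \tau_i, \quad c_1(\omega) = 1, \quad c_{i+1}(\omega) = T_i(\omega), \quad \theta_{i+1} = \theta^*_i, \qquad i = 1, \ldots, r, \tag{1.7}
$$

with the restraints

$$
\sum_{\Omega} p^*(\omega) = 1, \qquad \sum_{\Omega} T_i(\omega)\,p^*(\omega) = \theta^*_i, \qquad i = 1, 2, \ldots, r. \tag{1.8}
$$

In accordance with (1.3) - (1.8) we consider
the partitioning of the matrices as follows:

$$
C = \begin{pmatrix} C_1 \\ C_2 \end{pmatrix} \text{ where } C_1 \text{ is } 1 \times \Omega,\; C_2 \text{ is } r \times \Omega, \qquad \theta = \begin{pmatrix} 1 \\ \theta^* \end{pmatrix} \text{ where } \theta^* \text{ is } r \times 1,
$$

that is

$$
C_1 p = 1, \qquad C_2 p = \theta^*.
$$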

In the applications we take π(ω) = x(ω)/N, where
$N = \sum_{\Omega} x(\omega)$. Setting $x^*(\omega) = Np^*(\omega)$, the minimum
discrimination information statistic is

$$
2I(x^*:x) = 2\sum_{\Omega} x^*(\omega)\,\ln\frac{x^*(\omega)}{x(\omega)}, \tag{1.9}
$$

which is asymptotically distributed as $\chi^2$ with r
degrees of freedom if the observed table x(ω) satisfies
the null hypothesis or model implied by (1.2).

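In accordance with the discussion in Kullback
(1970, sections 4 and 7) the quadratic approximation
to $2I(x^*:x)$ is (see Chapter 3, section 6 herein)

$$
2I(x^*:x) \approx (N\theta^* - N\phi^*)'\,S_{22.1}^{-1}\,(N\theta^* - N\phi^*), \tag{1.10}
$$

where $C_1\pi = 1$, $C_2\pi = \phi^*$,

$$
S = \begin{pmatrix} C_1 \\ C_2 \end{pmatrix} D_x\,(C_1', C_2') = \begin{pmatrix} C_1 D_x C_1' & C_1 D_x C_2' \\ C_2 D_x C_1' & C_2 D_x C_2' \end{pmatrix} = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}, \qquad S_{22.1} = S_{22} - S_{21}S_{11}^{-1}S_{12},
$$

and $D_x$ is the Ω x Ω diagonal matrix with main diagonal
entries x(ω). We shall see that the right-hand side of (1.10) is
the minimum modified $\chi^2$.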
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Some  examples  of  the  matrix  C 


Consider  a  3  x  3  contingency  table  Table 


11 

12 

13 

21 

22 

23 

31 

32 

33 

,  1 

2 

3 

4 

5 

6 

7 

8 

9 

e 

nr 

— r 

T 

~X 

"X 

~x 

~r 

'X 

“T 

I 

0 

i 

0 

-1 

0 

0 

0 

0 

0 

0 

0 

0 

l 

0 

0 

0 

-1 

0 

0 

0 

0 

0 

0 

0 

0 

1 

0 

-1 

0 

0 

ii 

12 

13 

21 

22 

23 

31 

32 

33 

,  i 

2 

3 

4 

5 

6 

7 

8 

9 

e 

nr 

"X 

~X 

1 

T 

"T 

~T 

— r 

1 

I 

0 

1 

1 

-1 

0 

0 

-l 

0 

0 

0 

0 

-1 

0 

1 

0 

1 

0 

-1 

0 

0 

11  12  13 
21  22  23 
31  32  33 

Hyp.  of  symmetry 

P  (12)  =  p (21) ;  p (13)  =  p ( 31 ) ; 
p(23)  =  p (32); 


Hyp.  of  marginal  homogeneity 

p (11) +p  (12) +p (13) =p (11) +p (21) +p (31) - 
p(21)+p(22)+p(23)=p(12)+p(22)+p(32). 


Consider a 2 x 2 contingency table with cells

    11  12
    21  22

Hyp. of specified marginals

p(11)+p(12) = 3/4;
p(11)+p(21) = 3/4.

Implies p(21)+p(22) = 1/4, p(12)+p(22) = 1/4.

    Cell   11  12  21  22
    No.     1   2   3   4     θ
            1   1   1   1     1
            1   1   0   0    3/4
            1   0   1   0    3/4

Consider a 2 x 2 x 2 contingency table with cells

      k = 1        k = 2
    11  12       11  12
    21  22       21  22

Hyp. of specified marginals

p(1..) = p(111)+p(112)+p(121)+p(122) = 1/2;
p(.1.) = p(111)+p(112)+p(211)+p(212) = 1/2;
p(..1) = p(111)+p(121)+p(211)+p(221) = 1/2.

Implies p(2..) = p(.2.) = p(..2) = 1/2.

    Cell   111  112  121  122  211  212  221  222
    No.      1    2    3    4    5    6    7    8     θ
             1    1    1    1    1    1    1    1     1
             1    1    1    1    0    0    0    0    1/2
             1    1    0    0    1    1    0    0    1/2
             1    0    1    0    1    0    1    0    1/2
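The symmetry example can be checked numerically. The following is only a sketch: the helper name `symmetry_constraints` and the symmetric table `p_sym` are hypothetical illustrations, not part of the original programs. It builds the rows of C in the cell order 11, 12, ..., 33 and confirms that a symmetric table satisfies C p = θ.

```python
import numpy as np

# Cells of the 3 x 3 table in lexicographic order: 11, 12, ..., 33.
cells = [(i, j) for i in (1, 2, 3) for j in (1, 2, 3)]

def symmetry_constraints():
    """Return (C, theta) for the hypothesis p(ij) = p(ji), i < j."""
    rows = [np.ones(len(cells))]      # first row: the probabilities sum to 1
    theta = [1.0]
    for i in (1, 2, 3):
        for j in range(i + 1, 4):
            row = np.zeros(len(cells))
            row[cells.index((i, j))] = 1.0    # +p(ij)
            row[cells.index((j, i))] = -1.0   # -p(ji)
            rows.append(row)
            theta.append(0.0)
    return np.array(rows), np.array(theta)

C, theta = symmetry_constraints()

# A hypothetical symmetric table (entries sum to 1), flattened in cell order.
p_sym = np.array([[0.20, 0.10, 0.05],
                  [0.10, 0.20, 0.05],
                  [0.05, 0.05, 0.20]]).ravel()
residual = C @ p_sym - theta          # zero when the hypothesis holds
```

The marginal homogeneity matrix can be produced the same way by replacing each symmetry row with a row-sum minus column-sum contrast.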
2. Minimum Modified $\chi^2$ Estimation

We shall use the same notation as before. For minimum modified $\chi^2$ estimation we want the value of $p(\omega)$ which minimizes the modified $\chi^2$,

(2.1)  $\chi'^2 = N \sum_{\Omega} \dfrac{(p(\omega) - \pi(\omega))^2}{\pi(\omega)},$

subject to the constraints (1.2) or (1.3). Differentiating $\chi'^2$ with respect to $p(\omega)$ and using Lagrange multipliers (absorbing constant factors into the multipliers) we have

(2.2)  $\dfrac{p(\omega) - \pi(\omega)}{\pi(\omega)} - \sum_i \lambda_i c_i(\omega) = 0.$

If we set $\xi(\omega) = (p(\omega) - \pi(\omega))/\pi(\omega)$, $\xi' = (\xi(1),\ldots,\xi(\Omega))$, $\lambda' = (\lambda_1,\ldots,\lambda_{r+1})$, then (2.2) may be written as (matrix notation)

(2.3)  $\xi = C'\lambda,$

(2.4)  $p = \pi + D_\pi C'\lambda,$

where $p' = (p(1),\ldots,p(\Omega))$, $\pi' = (\pi(1),\ldots,\pi(\Omega))$, and $D_\pi$ is the $\Omega \times \Omega$ diagonal matrix with main diagonal $\pi(1),\ldots,\pi(\Omega)$. If we set (see (1.10))

(2.5)  $C\pi = \phi, \quad \phi' = (1, \hat\theta'),$

then from (2.4) we get

(2.6)  $C(p - \pi) = \theta - \phi = C D_\pi C'\lambda,$

(2.7)  $\lambda = (C D_\pi C')^{-1}(\theta - \phi),$

that is

(2.8)  $p = \pi + D_\pi C'(C D_\pi C')^{-1}(\theta - \phi),$

or, with $\tilde{x} = Np$, $x = N\pi$,

(2.9)  $\tilde{x} = x + D_x C'(C D_x C')^{-1}(N\theta - N\phi),$

where $D_x = N D_\pi$. Since

(2.10)  $\min \chi'^2 = \sum_{\Omega} \dfrac{(\tilde{x}(\omega) - x(\omega))^2}{x(\omega)} = \left(D_x^{-1/2}(\tilde{x} - x)\right)'\left(D_x^{-1/2}(\tilde{x} - x)\right)$

and from (2.9)

(2.11)  $D_x^{-1/2}(\tilde{x} - x) = D_x^{1/2} C'(C D_x C')^{-1}(N\theta - N\phi),$

we have

(2.12)  $\min \chi'^2 = (N\theta - N\phi)'(C D_x C')^{-1} C D_x^{1/2} D_x^{1/2} C'(C D_x C')^{-1}(N\theta - N\phi) = (N\theta - N\phi)'(C D_x C')^{-1}(N\theta - N\phi).$

Using the notation of (1.10),

(2.13)  $(C D_x C')^{-1} = S^{-1} = \begin{pmatrix} S^{11} & S^{12} \\ S^{21} & S_{22.1}^{-1} \end{pmatrix},$

(2.14)  $(N\theta - N\phi)' = (0,\ (N\theta^* - N\hat\theta)'),$

hence

(2.15)  $\min \chi'^2 = (0,\ (N\theta^* - N\hat\theta)') \begin{pmatrix} S^{11} & S^{12} \\ S^{21} & S_{22.1}^{-1} \end{pmatrix} \begin{pmatrix} 0 \\ N\theta^* - N\hat\theta \end{pmatrix} = (N\theta^* - N\hat\theta)'\, S_{22.1}^{-1}\, (N\theta^* - N\hat\theta),$

that is, the right-hand side of (1.10). Note that if we use the approximation

(2.16)  $\dfrac{p(\omega) - \pi(\omega)}{\pi(\omega)} \approx \ln\left(1 + \dfrac{p(\omega) - \pi(\omega)}{\pi(\omega)}\right),$

then (2.2) is an approximation to (1.5).
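As a numerical illustration of (2.9) and (2.12), the sketch below computes the minimum modified chi-squared estimate and its value directly with NumPy. The function name `min_modified_chi2` is a hypothetical choice made here; the counts, C matrix and θ are those of the maize example treated in the next section.

```python
import numpy as np

def min_modified_chi2(x, C, theta):
    """Return (x_tilde, X2) following (2.9) and (2.12).

    x     : observed counts (all assumed positive)
    C     : constraint matrix whose first row is all ones
    theta : right-hand side of C p = theta
    """
    x = np.asarray(x, dtype=float)
    C = np.asarray(C, dtype=float)
    theta = np.asarray(theta, dtype=float)
    N = x.sum()
    delta = N * theta - C @ x          # N*theta - N*phi, because C x = N*phi
    S = C @ np.diag(x) @ C.T           # S = C D_x C'
    lam = np.linalg.solve(S, delta)    # (C D_x C')^{-1} (N*theta - N*phi)
    x_tilde = x + x * (C.T @ lam)      # x + D_x C' lam, equation (2.9)
    X2 = float(delta @ lam)            # quadratic form of (2.12)
    return x_tilde, X2

x = [1997, 906, 904, 32]
C = [[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0]]
theta = [1.0, 0.75, 0.75]
x_tilde, X2 = min_modified_chi2(x, C, theta)
```

By construction the estimate satisfies the constraints exactly, since $C\tilde{x} = Cx + S S^{-1}(N\theta - N\phi) = N\theta$. The value of `X2` should be close to the figure obtained in the worked example, which uses an inverse rounded to five decimals.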
3. An iterative computer algorithm - Single Sample

For convenience in discussion let us call the preceding discussion the single sample case. We shall consider an extension of the concepts to the k-sample case but it will be helpful not to go to the k-sample case directly. We now consider an iterative computer algorithm which will provide the minimum discrimination information estimate with the minimum modified $\chi^2$ estimate as a by-product. The single sample algorithm is a special case of the k-sample algorithm, but it will be helpful to consider the single sample case in detail (see Dempster, 1971).


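(3.1)  $C p = \theta$, $C = \begin{pmatrix} C_1 \\ C_2 \end{pmatrix}$, $C_1$ is $1 \times \Omega$, $C_2$ is $r \times \Omega$, $\theta = \begin{pmatrix} 1 \\ \theta^* \end{pmatrix}$, $\theta^*$ is $r \times 1$,

(3.2)  $C x = N\phi$, $\phi = \begin{pmatrix} 1 \\ \hat\theta \end{pmatrix}$, $\hat\theta$ is $r \times 1$, $x$ is the $\Omega \times 1$ matrix of observations, $N = \sum_{\Omega} x(\omega)$,

(3.3)  $D_x$ is the $\Omega \times \Omega$ diagonal matrix of observations,

(3.4)  $S = C D_x C' = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}$, $S_{11}$ is $1 \times 1$, $S_{21} = S_{12}'$, $S_{12}$ is $1 \times r$, $S_{22}$ is $r \times r$,

(3.5)  $S_{22.1} = S_{22} - S_{21} S_{11}^{-1} S_{12},$

(3.6)  $\Delta = N\theta - N\phi = \begin{pmatrix} 0 \\ N\theta^* - N\hat\theta \end{pmatrix} = \begin{pmatrix} 0 \\ d \end{pmatrix}$, $d = N\theta^* - N\hat\theta$ is $r \times 1$,

(3.7)  $t^{(j)} = S_{22.1}^{-1}\, d^{(j)}$, $j = 0, 1, 2, \ldots$

Let $\ln y$ denote an $\Omega \times 1$ matrix and $\ln x$ the $\Omega \times 1$ matrix of $\ln x(1), \ldots, \ln x(\Omega)$, where $x(1), \ldots, x(\Omega)$ are the original observations.

(3.8)  $(\mathrm{tau})^{(j+1)} = (\mathrm{tau})^{(j)} + t^{(j)}$, $(\mathrm{tau})^{(j)} = 0$ for $j = 0$, $j = 0, 1, 2, \ldots,$

(3.9)  $\ln y^{(j)} = \ln x + C_2'(\mathrm{tau})^{(j)}$, $j = 1, 2, \ldots,$

(3.10)  $y^{(j)}(1), \ldots, y^{(j)}(\Omega)$, $j = 1, 2, \ldots,$

(3.11)  $L^{(j)} = \ln \dfrac{N}{y^{(j)}(1) + \cdots + y^{(j)}(\Omega)},$

(3.12)  $\ln x^{(j)}(1) = L^{(j)} + \ln y^{(j)}(1),\ \ldots,\ \ln x^{(j)}(\Omega) = L^{(j)} + \ln y^{(j)}(\Omega),$

(3.13)  $x^{(j)}(1), \ldots, x^{(j)}(\Omega)$, $j = 1, 2, \ldots$

In step (3.7), j=0 corresponds to the values computed in steps (3.1) to (3.6) using the original observations, and j=1,2,... corresponds to the procedures in steps (3.1) to (3.6), however using the values $x^{(j)}(1), \ldots, x^{(j)}(\Omega)$ in step (3.13).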
Note that in step (3.9) x is always composed of the original observations.

The iteration is continued until the maximum value of the absolute values of the differences between successive iterates is less than a specified small value.

The final iterated value $x^{(j)}$ is the m.d.i. estimate $x^*$ and $2I(x^*:x)$ is computed and is asymptotically a $\chi^2$ with r degrees of freedom.
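Steps (3.1) to (3.13) can be sketched in a few lines of NumPy. This is an illustration only, not the original program: the function name, the stopping rule on successive estimates, and the default tolerance are assumptions made here, and all observed counts are assumed positive.

```python
import numpy as np

def mdi_single_sample(x, C, theta, tol=1e-8, max_iter=50):
    """Iterate (3.4)-(3.13); return the m.d.i. estimate x* and 2I(x*:x)."""
    x = np.asarray(x, dtype=float)
    C = np.asarray(C, dtype=float)
    theta = np.asarray(theta, dtype=float)
    N = x.sum()
    C2 = C[1:]                       # constraint rows after the row of ones
    target = N * theta[1:]           # N theta*
    ln_x = np.log(x)                 # always the original observations (3.9)
    tau = np.zeros(C2.shape[0])      # (tau)^(0) = 0
    x_cur = x.copy()
    for _ in range(max_iter):
        S = C @ np.diag(x_cur) @ C.T                               # (3.4)
        S221 = S[1:, 1:] - np.outer(S[1:, 0], S[0, 1:]) / S[0, 0]  # (3.5)
        d = target - C2 @ x_cur                                    # (3.6)
        tau = tau + np.linalg.solve(S221, d)                       # (3.7), (3.8)
        y = np.exp(ln_x + C2.T @ tau)                              # (3.9), (3.10)
        x_new = y * (N / y.sum())                                  # (3.11)-(3.13)
        change = np.max(np.abs(x_new - x_cur))
        x_cur = x_new
        if change < tol:             # successive iterates agree
            break
    two_I = 2.0 * float(np.sum(x_cur * np.log(x_cur / x)))
    return x_cur, two_I

# Maize data and 3:1 marginal hypothesis from the example that follows.
C = [[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0]]
x_star, two_I = mdi_single_sample([1997, 906, 904, 32], C, [1.0, 0.75, 0.75])
```

Solving with `np.linalg.solve` rather than forming $S_{22.1}^{-1}$ explicitly is a design choice of this sketch; it gives the same $t^{(j)}$ as (3.7).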
To illustrate the single sample algorithm let us consider a 2 x 2 contingency table discussed by R. A. Fisher, Statistical Methods for Research Workers, 7th Ed. p. 314 and also considered in Ireland and Kullback (1968b) using a different algorithm, viz., adjustments of the marginals.

The 2 x 2 contingency table gives seedling counts on self-fertilised heterozygotes for two factors in maize, Starchy v. Sugary and Green v. White base leaf.

Table

               Green    White    Total
    Starchy     1997      906     2903
    Sugary       904       32      936
    Total       2901      938     3839

In accordance with genetic theory, the marginals should occur in the ratio 3 to 1 and it is desired to calculate an estimate consistent with the genetic theory and test whether the observed values are consistent therewith.

The C matrix and θ are

    Cell   11  12  21  22
    No.     1   2   3   4     θ
            1   1   1   1     1
            1   1   0   0    .75
            1   0   1   0    .75
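$D_x = \begin{pmatrix} 1997 & 0 & 0 & 0 \\ 0 & 906 & 0 & 0 \\ 0 & 0 & 904 & 0 \\ 0 & 0 & 0 & 32 \end{pmatrix}, \qquad \ln x = \begin{pmatrix} 7.599401 \\ 6.809039 \\ 6.806829 \\ 3.465736 \end{pmatrix},$

$C x = N\phi = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 1997 \\ 906 \\ 904 \\ 32 \end{pmatrix} = \begin{pmatrix} 3839 \\ 2903 \\ 2901 \end{pmatrix}, \qquad N = 3839,$

$S = C D_x C' = \begin{pmatrix} 3839 & 2903 & 2901 \\ 2903 & 2903 & 1997 \\ 2901 & 1997 & 2901 \end{pmatrix},$

$S_{22.1} = \begin{pmatrix} 2903 & 1997 \\ 1997 & 2901 \end{pmatrix} - \begin{pmatrix} 2903 \\ 2901 \end{pmatrix} \dfrac{1}{3839} (2903,\ 2901) = \begin{pmatrix} 2903 & 1997 \\ 1997 & 2901 \end{pmatrix} - \begin{pmatrix} 2195.2094 & 2193.6971 \\ 2193.6971 & 2192.1857 \end{pmatrix} = \begin{pmatrix} 707.7906 & -196.6971 \\ -196.6971 & 708.8143 \end{pmatrix},$

$d = N\theta^* - N\hat\theta = \begin{pmatrix} 2879.25 \\ 2879.25 \end{pmatrix} - \begin{pmatrix} 2903 \\ 2901 \end{pmatrix} = \begin{pmatrix} -23.75 \\ -21.75 \end{pmatrix},$

$S_{22.1}^{-1} = \dfrac{1}{\mathrm{Det}} \begin{pmatrix} 708.8143 & 196.6971 \\ 196.6971 & 707.7906 \end{pmatrix}, \quad \mathrm{Det} = 463002.3496, \qquad S_{22.1}^{-1} = \begin{pmatrix} .00153 & .00042 \\ .00042 & .00153 \end{pmatrix},$

$t^{(0)} = \begin{pmatrix} .00153 & .00042 \\ .00042 & .00153 \end{pmatrix} \begin{pmatrix} -23.75 \\ -21.75 \end{pmatrix} = \begin{pmatrix} -.0454725 \\ -.0432525 \end{pmatrix}, \qquad (\mathrm{tau})^{(1)} = \begin{pmatrix} -.0454725 \\ -.0432525 \end{pmatrix},$

$C_2'(\mathrm{tau})^{(1)} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} -.0454725 \\ -.0432525 \end{pmatrix} = \begin{pmatrix} -.088725 \\ -.0454725 \\ -.0432525 \\ 0 \end{pmatrix},$

    ln y(1)(1) = 7.599401 - .088725  = 7.510676,    y(1)(1) = 1827.448,
    ln y(1)(2) = 6.809039 - .0454725 = 6.763567,    y(1)(2) =  865.733,
    ln y(1)(3) = 6.806829 - .0432525 = 6.763576,    y(1)(3) =  865.733,
    ln y(1)(4) = 3.465736 - 0        = 3.465736,    y(1)(4) =   32.000,

$y^{(1)}(1) + \cdots + y^{(1)}(4) = 3590.914, \qquad L^{(1)} = \ln \dfrac{3839}{3590.914} = 0.066805,$

    ln x(1)(1) = 0.066805 + 7.510676 = 7.577481,    x(1)(1) = 1953.701,
    ln x(1)(2) = 0.066805 + 6.763567 = 6.830372,    x(1)(2) =  925.535,
    ln x(1)(3) = 0.066805 + 6.763576 = 6.830381,    x(1)(3) =  925.543,
    ln x(1)(4) = 0.066805 + 3.465736 = 3.532541,    x(1)(4) =   34.211,

$X^2 = (-23.75,\ -21.75) \begin{pmatrix} .00153 & .00042 \\ .00042 & .00153 \end{pmatrix} \begin{pmatrix} -23.75 \\ -21.75 \end{pmatrix} = 2.021.$

Retaining two decimal places we take

    x*(11) = 1953.71,   x*(12) = 925.54,   x*(1.) = 2879.25,
    x*(21) =  925.54,   x*(22) =  34.21,
    x*(.1) = 2879.25,                      total   3839.00.

Since $d^{(1)} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $t^{(1)} = S_{22.1}^{-1}\, d^{(1)} = (0)$, and there will be no change in the estimate by further iteration.

$2I(x^*:x) = 2\left(1953.71 \ln \dfrac{1953.71}{1997} + 925.54 \ln \dfrac{925.54}{906} + 925.54 \ln \dfrac{925.54}{904} + 34.21 \ln \dfrac{34.21}{32}\right)$

$= 2(-42.8174 + 19.7492 + 21.7946 + 2.2846) = 2(1.011) = 2.022.$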
4. k-samples

The extension of the previous single sample discussion to the case of k-samples makes use of an approach due to Gokhale (1973).

Consider the k discrete spaces $\Omega_i$, i=1,2,...,k, where we designate the "points" or "cells" of $\Omega_i$ by $\omega_i(j)$, $j = 1, 2, \ldots, \Omega_i$, $\Omega_i = (\omega_i(1), \ldots, \omega_i(\Omega_i))$. We use $\Omega_i$ to represent both the space and the number of "cells" in it. Let $p_i = (p_i(\omega_i(1)), \ldots, p_i(\omega_i(\Omega_i)))$, i=1,...,k, be k sets of probability distributions defined respectively over $\Omega_i$, i=1,2,...,k. Let $p' = (p_1', \ldots, p_k')$ be a $1 \times \Omega$ matrix, where $\Omega = \Omega_1 + \Omega_2 + \cdots + \Omega_k$. Let $\mathcal{P}$ be the collection of all such matrices (vectors) $p$. For a given $\pi' = (\pi_1', \ldots, \pi_k') \in \mathcal{P}$ and $p \in \mathcal{P}$ the generalized discrimination information is given by

(4.1)  $I(p:\pi) = \sum_i w_i \sum_{j=1}^{\Omega_i} p_i(\omega_i(j)) \ln \dfrac{p_i(\omega_i(j))}{\pi_i(\omega_i(j))},$

where the constants $w_i$ are known and are such that $\sum_i w_i = 1$, $0 < w_i < 1$. Let us denote the elements ("points" or "cells") of $\Omega = \Omega_1 + \cdots + \Omega_k$ by $\omega(ij)$, $i = 1, \ldots, k$, $j = 1, \ldots, \Omega_i$, so that $\omega(i1), \ldots, \omega(i\Omega_i)$ are the components of $\Omega$ belonging to $\Omega_i$.
The minimum discrimination information estimate is the value of $p$ which minimizes the generalized discrimination information in (4.1) over the family of $p$'s which satisfy the restraints

(4.2)  $B p = \theta,$

where $B$ is $(k+r) \times \Omega$, $p$ is $\Omega \times 1$, $\theta$ is $(k+r) \times 1$ and the rank of $B$ is $k+r < \Omega$. We shall now transform the problem to a canonical form similar to that of the single sample case. Let

(4.3)  $W_i$ be an $\Omega_i \times \Omega_i$ diagonal matrix with diagonal elements $w_i$,

and

(4.4)  $W = \begin{pmatrix} W_1 & 0 & \cdots & 0 \\ 0 & W_2 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & W_k \end{pmatrix},$

(4.5)  $P = W p$, $P' = (P(\omega(11)), \ldots, P(\omega(1\Omega_1)), \ldots, P(\omega(k1)), \ldots, P(\omega(k\Omega_k))),$

(4.6)  $\Pi = W \pi$, $\Pi' = (\Pi(\omega(11)), \ldots, \Pi(\omega(k\Omega_k))),$

(4.7)  $C = B W^{-1}$, $C$ is $(k+r) \times \Omega$, $W^{-1} = V$.

We note that

(4.8)  $\sum_{\Omega} P(\omega) = \sum_{j=1}^{\Omega_1} w_1 p_1(\omega_1(j)) + \cdots + \sum_{j=1}^{\Omega_k} w_k p_k(\omega_k(j)) = w_1 + \cdots + w_k = 1,$

(4.9)  $\sum_{\Omega} \Pi(\omega) = \sum_{j=1}^{\Omega_1} w_1 \pi_1(\omega_1(j)) + \cdots + \sum_{j=1}^{\Omega_k} w_k \pi_k(\omega_k(j)) = w_1 + \cdots + w_k = 1,$

(4.10)  $I(p:\pi) = \sum_i \sum_{j=1}^{\Omega_i} w_i p_i(\omega_i(j)) \ln \dfrac{w_i p_i(\omega_i(j))}{w_i \pi_i(\omega_i(j))} = \sum_{\Omega} P(\omega) \ln \dfrac{P(\omega)}{\Pi(\omega)} = I(P:\Pi),$

(4.11)  $B p = B W^{-1} W p = C P = \theta.$
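The canonical transformation (4.4) to (4.11) amounts to rescaling the columns of B. The sketch below uses weights $w_i = N_i/N$ proportional to the sample sizes; the helper `canonical_C`, the sample sizes 100 and 300, and the distribution `p` are hypothetical values chosen only to check (4.8) and (4.11).

```python
import numpy as np

def canonical_C(B, cells_per_sample, sample_sizes):
    """Return (C, w_diag) with C = B W^{-1} and w_diag the diagonal of W."""
    sizes = np.asarray(sample_sizes, dtype=float)
    w = sizes / sizes.sum()                    # w_i = N_i / N
    w_diag = np.repeat(w, cells_per_sample)    # w_i repeated over the cells of sample i
    C = np.asarray(B, dtype=float) / w_diag    # each column of B divided by its w_i
    return C, w_diag

# Two binomial samples with the hypothesis p(11) = p(21).
B = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, -1, 0]], dtype=float)
C, w_diag = canonical_C(B, [2, 2], [100, 300])

p = np.array([0.3, 0.7, 0.3, 0.7])   # one distribution per sample, stacked
P = w_diag * p                       # P = W p, equation (4.5)
```

With these sizes $v_1 = N/N_1 = 4$ and $v_2 = N/N_2 = 4/3$, so the first two rows of C carry $v_1$ and $v_2$ in place of the ones in B.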

In terms of the canonical transformation the k-sample problem may now be formulated as finding the m.d.i. estimate $P^*(\omega)$ minimizing

(4.12)  $I(P:\Pi) = \sum_{\Omega} P(\omega) \ln \dfrac{P(\omega)}{\Pi(\omega)},$

subject to

(4.13)  $C P = \theta,$

where $C$ is $(k+r) \times \Omega$, $P$ is $\Omega \times 1$, $\theta$ is $(k+r) \times 1$ and the rank of $C$ is $k+r < \Omega$. Paralleling the discussion of the single sample case, with appropriate modifications, we denote the elements of the matrix $C$ by $c_i(\omega)$, $i = 1, \ldots, k, k+1, \ldots, k+r$, $\omega = 11, \ldots, 1\Omega_1, \ldots, k1, \ldots, k\Omega_k$. We may write (4.13) as

(4.14)  $\sum_{\Omega} c_i(\omega) P(\omega) = \theta_i$, $i = 1, \ldots, k, k+1, \ldots, k+r.$

We shall usually assume $b_i(\omega_i(j)) = 1$, $j = 1, \ldots, \Omega_i$, $i = 1, \ldots, k$, and zero otherwise, that is,

(4.15)  $c_i(\omega) = v_i$ for $\omega = i1, \ldots, i\Omega_i$, $v_i = 1/w_i$,
        $c_i(\omega) = 0$ otherwise, $i = 1, 2, \ldots, k$,
        $\theta_i = 1$, $i = 1, 2, \ldots, k$.

In accordance with the m.d.i. theorem we have

(4.16)  $\ln \dfrac{P^*(\omega)}{\Pi(\omega)} = \tau_1 c_1(\omega) + \cdots + \tau_k c_k(\omega) + \tau_{k+1} c_{k+1}(\omega) + \cdots + \tau_{k+r} c_{k+r}(\omega)$, $\omega = 11, \ldots, k\Omega_k.$

We now partition the matrices as follows:

(4.17)  $C = \begin{pmatrix} C_1 \\ C_2 \end{pmatrix}$, where $C_1$ is $k \times \Omega$, $C_2$ is $r \times \Omega$,

(4.18)  $\theta = \begin{pmatrix} 1 \\ \theta^* \end{pmatrix}$, where $1$ is a $k \times 1$ matrix of 1's, $\theta^*$ is $r \times 1$,

that is, $C_1 P = 1$, $C_2 P = \theta^*$.

If we have k samples corresponding to $\Omega_1, \ldots, \Omega_k$, where the sum of the observations in the i-th sample is $N_i$ and $N = N_1 + N_2 + \cdots + N_k$, then $w_i = N_i/N$,

(4.19)  $x^*(\omega) = N P^*(\omega),$

(4.20)  $x(\omega) = N \Pi(\omega),$

(4.21)  $\sum_{\Omega} x(\omega) = \sum_{j=1}^{\Omega_1} N_1 \dfrac{x_1(j)}{N_1} + \cdots + \sum_{j=1}^{\Omega_k} N_k \dfrac{x_k(j)}{N_k} = N_1 + \cdots + N_k = N.$
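The minimum discrimination information statistic is

(4.22)  $2I(x^*:x) = 2N I(P^*:\Pi) = 2 \sum_{\Omega} x^*(\omega) \ln \dfrac{x^*(\omega)}{x(\omega)},$

which is asymptotically distributed as $\chi^2$ with r degrees of freedom if the observed values satisfy the hypothesis or model implied by (4.2) or (4.13). If we set

(4.23)  $C N \Pi = C x = N\phi$, $\phi = \begin{pmatrix} 1 \\ \hat\theta \end{pmatrix}$, where $x$ is $\Omega \times 1$, $1$ is a $k \times 1$ matrix of 1's, $\hat\theta$ is $r \times 1$,

then the quadratic approximation to $2I(x^*:x)$ is given by the minimum modified $\chi^2$ with r D.F.,

(4.24)  $X^2 = (N\theta^* - N\hat\theta)'\, S_{22.1}^{-1}\, (N\theta^* - N\hat\theta),$

where

(4.25)  $S = C D_x C' = \begin{pmatrix} C_1 D_x C_1' & C_1 D_x C_2' \\ C_2 D_x C_1' & C_2 D_x C_2' \end{pmatrix} = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix},$

$S_{11}$ is $k \times k$, $S_{21} = S_{12}'$, $S_{12}$ is $k \times r$, $S_{22}$ is $r \times r$.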

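An elementary example illustrating the $2I(x^*:x)$ quadratic approximation using the several sample approach.

Suppose we have observed two binomial samples

    x(11), x(12), x(11) + x(12) = N_1,
    x(21), x(22), x(21) + x(22) = N_2,

and we want to test the null hypothesis that p(11) = p(21). The set up corresponding to $Bp = \theta$ is

    Cell   11  12  21  22
    No.     1   2   3   4     θ
            1   1   0   0     1
            0   0   1   1     1
            1   0  -1   0     0

Using $v_1 = 1/w_1 = N/N_1$, $v_2 = 1/w_2 = N/N_2$, $N = N_1 + N_2$, the transformation to $CP = \theta$ is

    Cell   11   12   21   22
    No.     1    2    3    4     θ
           v_1  v_1   0    0     1
            0    0   v_2  v_2    1
           v_1   0  -v_2   0     0

We must compute $S = C D_x C'$, that is

$S = \begin{pmatrix} v_1^2 (x(11) + x(12)) & 0 & v_1^2 x(11) \\ 0 & v_2^2 (x(21) + x(22)) & -v_2^2 x(21) \\ v_1^2 x(11) & -v_2^2 x(21) & v_1^2 x(11) + v_2^2 x(21) \end{pmatrix}.$

We now find

$S_{22.1} = v_1^2 x(11) + v_2^2 x(21) - \dfrac{v_1^4 x^2(11)}{v_1^2 (x(11) + x(12))} - \dfrac{v_2^4 x^2(21)}{v_2^2 (x(21) + x(22))}$

$= v_1^2 x(11) + v_2^2 x(21) - \dfrac{v_1^2 x^2(11)}{N_1} - \dfrac{v_2^2 x^2(21)}{N_2}$

$= v_1^2 x(11) \left(1 - \dfrac{x(11)}{N_1}\right) + v_2^2 x(21) \left(1 - \dfrac{x(21)}{N_2}\right).$

But

$d = 0 - (v_1 x(11) - v_2 x(21)),$

hence $X^2 = d' S_{22.1}^{-1} d$ is

$X^2 = \dfrac{(v_1 x(11) - v_2 x(21))^2}{v_1^2 x(11) \left(1 - \dfrac{x(11)}{N_1}\right) + v_2^2 x(21) \left(1 - \dfrac{x(21)}{N_2}\right)}.$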
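The two-binomial example can be verified numerically by forming $S = C D_x C'$ and comparing $d' S_{22.1}^{-1} d$ with the scalar expression for $S_{22.1}$ derived above. This is a sketch; the function names and the counts 30, 70, 45, 55 are hypothetical.

```python
import numpy as np

def two_binomial_X2(x11, x12, x21, x22):
    """X^2 = d' S_{22.1}^{-1} d for the hypothesis p(11) = p(21)."""
    N1, N2 = x11 + x12, x21 + x22
    N = N1 + N2
    v1, v2 = N / N1, N / N2
    C = np.array([[v1, v1, 0.0, 0.0],
                  [0.0, 0.0, v2, v2],
                  [v1, 0.0, -v2, 0.0]])
    x = np.array([x11, x12, x21, x22], dtype=float)
    S = C @ np.diag(x) @ C.T                             # S = C D_x C'
    S11, S12 = S[:2, :2], S[:2, 2:]
    S21, S22 = S[2:, :2], S[2:, 2:]
    S221 = S22 - S21 @ np.linalg.solve(S11, S12)         # S_22 - S_21 S_11^{-1} S_12
    d = 0.0 - C[2:] @ x                                  # N theta* - N theta-hat
    return float(d @ np.linalg.solve(S221, d))

def closed_form_X2(x11, x12, x21, x22):
    """Same quantity from the scalar expression for S_{22.1}."""
    N1, N2 = x11 + x12, x21 + x22
    N = N1 + N2
    v1, v2 = N / N1, N / N2
    s = v1**2 * x11 * (1 - x11 / N1) + v2**2 * x21 * (1 - x21 / N2)
    return (v1 * x11 - v2 * x21) ** 2 / s

matrix_value = two_binomial_X2(30, 70, 45, 55)
scalar_value = closed_form_X2(30, 70, 45, 55)
```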
5. An iterative computer algorithm - k-samples

For convenience (computer-wise) we shall use $n_i$ for $\Omega_i$ and $n$ for $\Omega$, that is, $n = n_1 + n_2 + \cdots + n_k$, where $n_i$ is the number of "cells" in the i-th sample whose total number of observations is $N_i$.

(5.1)  $C P = \theta$, $C = \begin{pmatrix} C_1 \\ C_2 \end{pmatrix}$, $C_1$ is $k \times n$, $C_2$ is $r \times n$, $\theta = \begin{pmatrix} 1 \\ \theta^* \end{pmatrix}$, $1$ is a $k \times 1$ matrix of ones, $\theta^*$ is $r \times 1$,

(5.2)  $C x = N\phi$, $\phi = \begin{pmatrix} 1 \\ \hat\theta \end{pmatrix}$, $\hat\theta$ is $r \times 1$, $1$ is a $k \times 1$ matrix of ones,

(5.3)  $D_x$ is the $n \times n$ diagonal matrix of observations,

(5.4)  $S = C D_x C' = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}$, $S_{11}$ is $k \times k$, $S_{22}$ is $r \times r$, $S_{21} = S_{12}'$,

(5.5)  $S_{22.1} = S_{22} - S_{21} S_{11}^{-1} S_{12},$

(5.6)  $\Delta = N\theta - N\phi = \begin{pmatrix} 0 \\ d \end{pmatrix}$, $0$ is a $k \times 1$ matrix of zeros, $d = N\theta^* - N\hat\theta$ is $r \times 1$,

(5.7)  $t^{(j)} = S_{22.1}^{-1}\, d^{(j)}$, $j = 0, 1, 2, \ldots$

Let $\ln y$ denote an $n \times 1$ matrix and $\ln x$ the $n \times 1$ matrix of $\ln x(1), \ldots, \ln x(n)$,

(5.8)  $(\mathrm{tau})^{(j+1)} = (\mathrm{tau})^{(j)} + t^{(j)}$, $(\mathrm{tau})^{(j)} = 0$ for $j = 0$, $j = 0, 1, 2, \ldots,$

(5.9)  $\ln y^{(j)} = \ln x + C_2'(\mathrm{tau})^{(j)}$, $j = 1, 2, \ldots,$

(5.10)  $y^{(j)}(1), \ldots, y^{(j)}(n)$, $j = 1, 2, \ldots,$

(5.11)  $S_1^{(j)} = \sum y^{(j)}(\omega)$ for $\omega$ the $n_1$ values in the first set,
        ...
        $S_k^{(j)} = \sum y^{(j)}(\omega)$ for $\omega$ the $n_k$ values in the k-th set,

(5.12)  $v_h L_h^{(j)} = M_h^{(j)} = \ln \dfrac{N_h}{S_h^{(j)}}$, $h = 1, 2, \ldots, k,$

(5.13)  $\ln x^{(j)}(\omega) = M_h^{(j)} + \ln y^{(j)}(\omega)$, for $\omega$ in set $h = 1, 2, \ldots, k$, $j = 1, 2, \ldots,$

(5.14)  $x^{(j)}(1), \ldots, x^{(j)}(n)$, $j = 1, 2, \ldots$

In step (5.7), j=0 corresponds to the values computed in steps (5.1) to (5.6) using x and j=1,2,... corresponds to the procedures in steps (5.1) to (5.6), however using the values $x^{(j)}(1), \ldots, x^{(j)}(n)$ in step (5.14). Note that in step (5.9) $\ln x$ is always composed of the initial values x.

The iteration is continued until the maximum value of the absolute values of the differences between successive iterates is less than a specified small value.

The final iterated value $x^{(j)}$ is the m.d.i. estimate $x^*$ and $2I(x^*:x)$ is computed with r degrees of freedom.

If the min. mod. $\chi^2$ estimates and the min. mod. $\chi^2$ value are desired the program continues and computes

(5.15)  $\lambda = (C D_x C')^{-1} \Delta = S^{-1} \Delta = \begin{pmatrix} S^{12} d \\ S_{22.1}^{-1} d \end{pmatrix},$

(5.16)  $\xi = C'\lambda = C'(C D_x C')^{-1} \Delta,$

(5.17)  $\tilde{x} = x + D_x \xi = x + D_x C'(C D_x C')^{-1} \Delta,$

(5.18)  $X^2 = \Delta'\lambda = \Delta'(C D_x C')^{-1} \Delta = (0,\ d') \begin{pmatrix} S^{11} & S^{12} \\ S^{21} & S_{22.1}^{-1} \end{pmatrix} \begin{pmatrix} 0 \\ d \end{pmatrix} = d'\, S_{22.1}^{-1}\, d.$

The $\tilde{x}$ in (5.17) are the minimum modified $\chi^2$ estimates and $X^2$ in (5.18) is the value of the minimum modified $\chi^2$ with r degrees of freedom and is the quadratic approximation to $2I(x^*:x)$. Note that $X^2$ in (5.18) can be computed without getting $\tilde{x}$.
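The k-sample steps (5.1) to (5.14) differ from the single sample case only in that each sample is renormalized to its own total $N_h$. The following NumPy sketch is an illustration, not the original program: the function name, the argument `cells_per_sample`, the stopping rule, and the two-binomial test data are assumptions made here, and all counts are assumed positive.

```python
import numpy as np

def mdi_k_sample(x, cells_per_sample, C, theta, tol=1e-8, max_iter=100):
    """Iterate (5.4)-(5.14); return the m.d.i. estimate x* and 2I(x*:x)."""
    x = np.asarray(x, dtype=float)
    C = np.asarray(C, dtype=float)
    theta = np.asarray(theta, dtype=float)
    k = len(cells_per_sample)
    N = x.sum()
    bounds = np.cumsum([0] + list(cells_per_sample))
    blocks = [slice(bounds[h], bounds[h + 1]) for h in range(k)]
    N_h = [x[b].sum() for b in blocks]       # sample totals N_1, ..., N_k
    C2 = C[k:]
    target = N * theta[k:]                   # N theta*
    ln_x = np.log(x)                         # initial values only, see (5.9)
    tau = np.zeros(C2.shape[0])
    x_cur = x.copy()
    for _ in range(max_iter):
        S = C @ np.diag(x_cur) @ C.T                         # (5.4)
        S11, S12 = S[:k, :k], S[:k, k:]
        S21, S22 = S[k:, :k], S[k:, k:]
        S221 = S22 - S21 @ np.linalg.solve(S11, S12)         # (5.5)
        d = target - C2 @ x_cur                              # (5.6)
        tau = tau + np.linalg.solve(S221, d)                 # (5.7), (5.8)
        y = np.exp(ln_x + C2.T @ tau)                        # (5.9), (5.10)
        x_new = np.empty_like(y)
        for b, total in zip(blocks, N_h):
            x_new[b] = y[b] * (total / y[b].sum())           # (5.11)-(5.14)
        change = np.max(np.abs(x_new - x_cur))
        x_cur = x_new
        if change < tol:
            break
    two_I = 2.0 * float(np.sum(x_cur * np.log(x_cur / x)))
    return x_cur, two_I

# Two binomial samples of 100 each; hypothesis p(11) = p(21), so v_1 = v_2 = 2.
C = [[2, 2, 0, 0], [0, 0, 2, 2], [2, 0, -2, 0]]
x_star, two_I = mdi_k_sample([30, 70, 45, 55], [2, 2], C, [1.0, 1.0, 0.0])
```

With `cells_per_sample` of length one the routine reduces to the single sample algorithm of section 3.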
6. Computer Programs

A basic program using the marginal fitting technique was prepared by Professor C. T. Ireland of The George Washington University. The current version in The George Washington University Computer Center is CONTAB III.

A modification of CONTAB was prepared by Marian Fisher. This program is in The George Washington University Computer Center as CONTABMOD. It provides as output, in addition to the estimates and their logarithms, the design matrices, values of the taus, and the covariance matrix of the taus.

The following programs are applicable to problems as described in the preceding chapter, as well as the "smoothing" or fitting problems. These programs were compiled by John C. Keegel and are in The George Washington University Computer Center.

For its interest we first illustrate the marginal fitting algorithm for the two-way marginals of a three-way table.


1. Iteration, marginal fitting algorithm.

The values of the p*-table can be computed by an iterative scheme which adjusts the π-table to satisfy successively the given marginal restraints. For a three-way table when all two-way marginals p(ij.), p(i.k), p(.jk) are given, the iteration cycles through

(1)  $p^{(3n+1)}(ijk) = \dfrac{p(ij.)}{p^{(3n)}(ij.)}\, p^{(3n)}(ijk),$

     $p^{(3n+2)}(ijk) = \dfrac{p(i.k)}{p^{(3n+1)}(i.k)}\, p^{(3n+1)}(ijk),$

     $p^{(3n+3)}(ijk) = \dfrac{p(.jk)}{p^{(3n+2)}(.jk)}\, p^{(3n+2)}(ijk), \quad n = 0, 1, \ldots,$

where $p^{(0)}(ijk)$ may be 1/rcd or p*(ijk). For a four-way table when all three-way marginals p(ijk.), p(ij.l), p(i.kl), p(.jkl) are given the iteration cycles through

(2)  $p^{(4n+1)}(ijkl) = \dfrac{p(ijk.)}{p^{(4n)}(ijk.)}\, p^{(4n)}(ijkl),$

     $p^{(4n+2)}(ijkl) = \dfrac{p(ij.l)}{p^{(4n+1)}(ij.l)}\, p^{(4n+1)}(ijkl),$

     $p^{(4n+3)}(ijkl) = \dfrac{p(i.kl)}{p^{(4n+2)}(i.kl)}\, p^{(4n+2)}(ijkl),$

     $p^{(4n+4)}(ijkl) = \dfrac{p(.jkl)}{p^{(4n+3)}(.jkl)}\, p^{(4n+3)}(ijkl),$

where $p^{(0)}(ijkl)$ may be 1/rstu or p*(ijkl). It can be shown that the iteration converges to p* and p* is unique.

Although the above iteration has been in terms of probabilities, in practice it has been found more convenient not to divide everything by n and the iterations are carried out using observed or estimated occurrences $x^{(0)}(ijkl) = n/rstu$, x(i...), x(ij..), etc., $x^*(ijkl) = n p^*(ijkl)$, and in fact our subsequent discussions will be in terms of observed or estimated occurrences. In certain cases when the estimates can be given explicitly in terms of specified marginals the iteration is completed after the first cycle, for example, given the observed one-way marginals,

$x^*(ijkl) = x(i...)\, x(.j..)\, x(..k.)\, x(...l) / n^3.$

Usually 5 to 7 cycles have been found to be sufficient to obtain agreement between marginals to within 0.001 when more than one cycle is required.
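The explicit one-cycle case can be checked directly: the product of the one-way marginals divided by $n^3$ reproduces every one-way marginal. The sketch below uses a randomly generated four-way table; the function name and the table itself are hypothetical.

```python
import numpy as np

def fit_one_way_marginals(x):
    """Return (estimate, marginals) with estimate = product of marginals / n^3."""
    x = np.asarray(x, dtype=float)
    n = x.sum()
    # One-way marginals x(i...), x(.j..), x(..k.), x(...l).
    margins = [x.sum(axis=tuple(a for a in range(4) if a != axis))
               for axis in range(4)]
    est = np.einsum('i,j,k,l->ijkl', *margins) / n**3
    return est, margins

rng = np.random.default_rng(0)
table = rng.integers(1, 20, size=(2, 3, 2, 2))   # hypothetical 2x3x2x2 counts
est, margins = fit_one_way_marginals(table)
```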

It may be helpful to elaborate somewhat the iterative algorithm given in (1) in terms of occurrences as follows:

1. Start with $x^{(0)}(ijk) = n/rcd$.

2. Compute the marginals $x^{(0)}(ij.)$.

3. Adjust $x^{(0)}(ijk)$ by the ratios of the observed marginals x(ij.) to the computed marginals $x^{(0)}(ij.)$. The adjusted entries are $x^{(1)}(ijk)$.

4. Compute the marginals $x^{(1)}(i.k)$.

5. Adjust $x^{(1)}(ijk)$ by the ratios of the observed marginals x(i.k) to the computed marginals $x^{(1)}(i.k)$. The adjusted entries are $x^{(2)}(ijk)$.

6. Compute the marginals $x^{(2)}(.jk)$.

7. Adjust $x^{(2)}(ijk)$ by the ratios of the observed marginals x(.jk) to the computed marginals $x^{(2)}(.jk)$. The adjusted entries are $x^{(3)}(ijk)$ and one cycle is completed.

8. Continue the procedure from steps (2) through (7) above using $x^{(3)}(ijk)$ as the starting entries.

9. Continue the process until the three sets of computed marginals agree with the observed marginals to within the specified tolerance.

We shall illustrate the iterative algorithm (1) with Cochran's data (1954) for the 2x2x3 Table 1.


TABLE 1

Data on number of mothers with previous infant losses

    Birth Order              Number of mothers with
                             losses            no losses
    2      Problem           x(111) = 20       x(121) = 82
           Control           x(211) = 10       x(221) = 54
    3-4    Problem           x(112) = 26       x(122) = 41
           Control           x(212) = 16       x(222) = 30
    5+     Problem           x(113) = 27       x(123) = 22
           Control           x(213) = 14       x(223) = 23

The sets of observed marginals are

    x(11.) =  73   x(12.) = 145   x(21.) =  40   x(22.) = 107
    x(1.1) = 102   x(1.2) =  67   x(1.3) =  49   x(2.1) =  64   x(2.2) = 46   x(2.3) = 37
    x(.11) =  30   x(.12) =  42   x(.13) =  41   x(.21) = 136   x(.22) = 71   x(.23) = 45

We shall find the values of x*(ijk) fitting these marginals.

Using $x^{(0)}(ijk) = 365/(2 \times 2 \times 3) = 30.416$ the sequence of values in Table 2 is obtained. After the first cycle, the "resemblance" between $x^{(3)}(ijk)$ and the final values x*(ijk) is already evident, and the tolerance requirement of 0.001 is met after 5 cycles.
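The marginal fitting cycle for a three-way table can be sketched as follows, using the counts of Table 1 indexed as x[i, j, k] with i = problem/control, j = losses/no losses, k = birth order. The function name, the convergence check after each full cycle, and the cycle limit are assumptions of this sketch rather than features of the CONTAB programs.

```python
import numpy as np

def fit_two_way_marginals(x, tol=1e-3, max_cycles=100):
    """Fit all two-way marginals of a three-way table by iterative adjustment.

    Returns (fitted table, number of cycles used).
    """
    x = np.asarray(x, dtype=float)
    fit = np.full(x.shape, x.sum() / x.size)     # start from n/rcd in every cell
    obs_ij, obs_ik, obs_jk = x.sum(axis=2), x.sum(axis=1), x.sum(axis=0)
    for cycle in range(1, max_cycles + 1):
        fit *= (obs_ij / fit.sum(axis=2))[:, :, None]   # match x(ij.)
        fit *= (obs_ik / fit.sum(axis=1))[:, None, :]   # match x(i.k)
        fit *= (obs_jk / fit.sum(axis=0))[None, :, :]   # match x(.jk)
        err = max(np.abs(fit.sum(axis=2) - obs_ij).max(),
                  np.abs(fit.sum(axis=1) - obs_ik).max(),
                  np.abs(fit.sum(axis=0) - obs_jk).max())
        if err < tol:
            return fit, cycle
    return fit, max_cycles

cochran = np.array([[[20, 26, 27], [82, 41, 22]],    # problem: losses, no losses
                    [[10, 16, 14], [54, 30, 23]]])   # control: losses, no losses
fitted, cycles = fit_two_way_marginals(cochran)
```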
2. KULLITR 2

KULLITR 2 is the computer program that performs the steps and procedures described in Chapter 5, sections

4. k-samples, 5. An iterative computer algorithm - k-samples.

The program is flexible and can accommodate a variety of experimental situations. In some problems the value of $N\theta$ may be determined from some known distribution x by $N\theta = C x$. In such cases it is not necessary to supply $N\theta$; the user furnishes x and the program computes $N\theta = C x$. For k-samples it is not necessary for the analyst to compute the appropriate weights and the matrix W, since if the user provides the B matrix the program computes $C = B W^{-1}$. Of course if the user desires to use arbitrary weights not related to the sample sizes one may have to supply the C matrix since in such cases the program cannot compute it. In those cases where $N\theta$ is provided by "external" hypotheses the program will also compute the minimum modified chi-squared estimates unless the user specifies otherwise. By properly setting appropriate parameters, in the case of complete contingency tables, cells will be coded lexicographically as in other programs for contingency table analysis.


The  information  that  must  be  supplied  to  the  program  is 
divided  into  three  segments : 

(1)  Parameters 

(2)  Factor  names 

(3)  Table  data  and  constraints 

The  parameter  list  (1)  must  be  followed  by  ;  .  The  factor 
names  (2)  must  be  followed  by  ;  . 

For  segments  (1)  and  (2)  the 
parameter  name  followed  by  =  followed  by  the  parameter  value 
must  be  punched  on  the  cards.  The  parameters  must  be  separated 
by  a  blank;  however  the  order  of  punching  the  parameters  within 
segment  (1)  is  not  important.  In  segment  (3) ,  only  numerical 
values  are  punched,  and  the  numbers  must  be  separated  by  blanks. 
Observed  values  of  zero  are  punched  as  0  but  the  program  treats 
them  automatically  as  0.000001. 

JCL Instruction

1. // Standard Job Card

2. // EXEC PL1XG,DSN='U.ST6630.IRELAND',PROG=KULLITR2

3. //GO.PUNCH DD SYSOUT=B,DCB=(RECFM=F,BLKSIZE=80)

4. //GO.SYSIN DD *

(1) Parameters

(2) Factor names

(3) Table information

5. /*

The cards numbered 2, 3, 4, 5 above make up the EXEC program. Card 3 is necessary only if punched output is desired and may otherwise be omitted. Card 5 follows the parameters, factor names and data and indicates the end of the run. If several jobs are to be run, the parameters, factor names and table information for each may be separated by a blank card and card 5 of the EXEC program placed at the very end.
PARAMETER          DEFAULT   EFFECT

NUMSET = k         1         The number k is the number of samples in the
                             k-sample problem.

INTERNAL = ' 'B    '1'B      '1'B causes Nθ to be calculated as C x from a
                             user supplied distribution x. '0'B implies
                             that Nθ will be supplied.

MATDIF = ' 'B      '0'B      '1'B implies that ill conditioned matrices
                             appear and inverts with special procedures.
                             '0'B uses standard procedures and will apply
                             in most cases.

TOL 1 =            .01       TOL 1 is the maximum absolute difference
                             allowed for Nθ - Nθ̂ for the first k
                             constraints in a k-sample problem. The
                             tolerance value should not involve more than
                             6 digits.

TOL 2 =            .01       TOL 2 is the maximum absolute difference
                             allowed for the last r components of
                             Nθ - Nθ̂. (See TOL 1)

TOPCOUNT =         15        If the program does not converge (satisfy
                             TOL 1 and TOL 2) after the number of
                             iterations specified by TOPCOUNT, the
                             tolerances are relaxed by moving the
                             offending tolerance one decimal place to the
                             left in steps of 5 iterations.

BMAT                         If BMAT = '1'B the program expects only the
                             B matrix to be supplied and will compute
                             C = B W^{-1}. If BMAT = '0'B the C matrix
                             must be supplied.

AOK                          If AOK = '1'B the program computes the
                             minimum modified chi-squared estimate. In
                             this case INTERNAL = '0'B. AOK = '0'B
                             suppresses the minimum modified chi-squared
                             estimate. Should be used if the matrix
                             S = C D_x C' will cause problems in the
                             attempt to invert it.

UNIF                         This parameter applies only when
                             INTERNAL = '1'B. If UNIF = '1'B, the initial
                             distribution in the iteration will be the
                             uniform distribution and need not be
                             supplied; the program computes it. If
                             UNIF = '0'B the initial distribution for the
                             iteration must be supplied.

CONDIF = ' 'B      '0'B      CONDIF = '1'B is used if there will be
                             difficulty in convergence, particularly when
                             the initial distribution is uniform and the
                             table is large or cell entries have a wide
                             range. Make TOPCOUNT large if used.

LISTS = ' 'B       '0'B      '1'B lists the S matrix. '0'B suppresses the
                             listing of the S matrix.

FIRSTEST = ' 'B    '1'B      '0'B suppresses listing the first estimate,
                             '1'B lists the first estimate.

THE PARAMETER LIST MUST BE FOLLOWED BY ;
(2) Factor names

This segment is used only if FACTORS > 1. Each factor name in ' ' is preceded by FACNAME (f) = where f is the factor number. For example, for a 2x2x2 table where the first factor is time, the second factor is cutting and the third factor is mortality we have

FACNAME (1) = 'TIME'

FACNAME (2) = 'CUTTING'

FACNAME (3) = 'MORTALITY'

This segment is optional and if used must terminate with ; . If not used ; must still be supplied only if FACTORS > 1.

(3) Table data and constraints

In this segment only the numerical values must be supplied following the indicated sequence.

Levels. If FACTORS > 1 and we have a 5x6x2 contingency table then the numbers 5 6 2 are punched. If we had a 4x3x2x2x2 contingency table then the numbers 4 3 2 2 2 are punched. If we had a 12x2x2 contingency table then the numbers 12 2 2 are punched. If FACTORS = 1 no values are punched.

PARTITION NUMBERS. If NUMSET > 1, that is, k-samples, then the number of distinct observations or cells in each set must appear. These will add to the number of columns of the C matrix. For example if NUMSET = 3 with 16 observations in sample 1, 4 observations in sample 2 and 4 observations in sample 3 then the numbers 16 4 4 are punched. (The C matrix has 24 columns.) If NUMSET = 4 with two observations in each set then the numbers 2 2 2 2 are punched (the C matrix has 8 columns).

The B or C matrix by rows. The B matrix if BMAT = '1'B, and the C matrix if BMAT = '0'B.

The observed values must be punched in lexicographic order corresponding to the columns of the C matrix. Observed values of zero are punched as 0 but the program automatically treats them as 0.000001.

Nθ. This is supplied only if INTERNAL = '0'B. The number of values must be the same as CNSTRNT = m, that is, the number of rows of the C matrix.

The initial distribution for the iteration. To be supplied only if INTERNAL = '1'B and UNIF = '0'B.

Remarks. In the cases when INTERNAL = '0'B, the output includes $X^2$, the minimum modified chi-squared value (the quadratic approximation to $2I(x^*:x)$) and $2I(x^*:x)$ where x* is the minimum discrimination information estimate and x the observed values. Both $X^2$ and $2I(x^*:x)$ are asymptotically distributed as chi-squared with r = m-k degrees of freedom.

In the cases when INTERNAL = '1'B, the output includes $X^2$, the chi-squared approximation to $2I(x^*:x)$ where now x is the initial distribution of the iteration, and also $2I(Z:x^*)$ where Z is the observed distribution. The degrees of freedom for $X^2$ and $2I(x^*:x)$ are (m-k)-(m'-k) = m-m' where the C matrix for the determination of the initial distribution is m' x n. The degrees of freedom for $2I(Z:x^*)$ are n-m where the C matrix is m x n. In this case we also have the analysis of information relation

    2I(Z:x)  =  2I(x*:x)  +  2I(Z:x*)
     n-m'         m-m'          n-m

with the associated degrees of freedom. The use of x for the initial and Z for the observed distribution should cause no difficulty in this case as the output specifies "Z IS OBSERVED TABLE AND X IS INITIAL DIST."
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3 .  DAARAT 

The  generalized  iterative  scaling  procedure  described  by 
J.  N.  Darroch  and  D.  Ratcliff  (1972),  Generalized  iterative 
scaling  for  log-linear  models,  Annals  Math.  Statist.  43,  No.  5, 
1470-1480,  extends  the  Deming-Stephan  algorithm  to  cases  in 
which  the  "design  matrix"  does  not  consist  only  of  zeros  and 
ones.  A  discussion  of  the  procedure  and  the  proof  of  the 
convergence of the iteration are to be found in the cited reference.
We shall present an exposition of the iteration and
a  user's  guide  to  the  related  computer  program  DARRAT  similar 
to  that  for  KULLITR  2.  The  basic  concepts  discussed  for  the 
analysis  of  k-samples  are  applicable  here  too.  The  basic 
difference  with  KULLITR  2  is  the  iterative  algorithm  used. 

For convenience, as a frame of reference, we give the generalized
iterative scaling algorithm as given by Darroch and Ratcliff.

Let $I$ be a finite set and let $p = \{p(i);\ i \in I,\ p(i) > 0,\ \sum_{i \in I} p(i) = 1\}$
be a probability function on $I$. Suppose that $p$ is a member of a
family of distributions satisfying the constraints

$$\sum_{i \in I} b_{si}\, p(i) = k_s, \quad s = 1, 2, \ldots, d, \qquad \sum_{i \in I} p(i) = 1, \tag{1}$$

where for all $s$ there exist $i \in I$ such that $b_{si} \neq 0$. The constraints
in (1) may be reformulated into the equivalent canonical form

$$\sum_{i \in I} a_{ri}\, p(i) = h_r, \quad r = 1, 2, \ldots, c, \qquad a_{ri} \geq 0, \quad \sum_{r=1}^{c} a_{ri} = 1, \quad h_r > 0, \quad \sum_{r=1}^{c} h_r = 1, \tag{2}$$

by defining

$$a_{si} = t_s (u_s + b_{si}), \ \text{all } i, \qquad h_s = t_s (u_s + k_s), \quad s = 1, 2, \ldots, d, \tag{3}$$

where $u_s \geq 0$, $t_s > 0$ are chosen to make $a_{si} \geq 0$ and
$\sum_{s=1}^{d} a_{si} \leq 1$ for all $i \in I$. If $\sum_{s=1}^{d} a_{si} = 1$ for all $i$
define $c = d$; otherwise define $c = d + 1$ and let

$$a_{ci} = 1 - \sum_{s=1}^{d} a_{si}, \qquad h_c = 1 - \sum_{s=1}^{d} h_s .$$

Now let $\pi = \{\pi(i);\ i \in I,\ \pi(i) > 0,\ \sum_{i \in I} \pi(i) \leq 1\}$ be a subprobability
function on $I$. The minimum discrimination information estimate
$p^*(i)$, $i \in I$, is that member of the family $p$ satisfying the
restraints (2) and minimizing

$$I(p : \pi) = \sum_{i \in I} p(i) \ln \frac{p(i)}{\pi(i)} \tag{4}$$

and is given by

$$\ln \frac{p^*(i)}{\pi(i)} = \sum_{r=1}^{c} a_{ri}\, \tau_r , \tag{5}$$

where the $\tau_r$ are parameters to be determined so that $p^*(i)$
satisfies the constraints (2). The values of $p^*(i)$ may be
determined by the convergent iteration given in the cited reference.


We remark that if we use the relations $i = \omega$, $I = \Omega$, $k_s = \theta_s$,
$b_{si} = b_s(\omega)$, then the constraints in (1) above are the same as
the constraints (1.2) in Chapter 5, section 1 or (4.2)
in Chapter 5, section 4.
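The following is a minimal sketch of generalized iterative scaling, not the DARRAT program itself. It assumes the standard form of the Darroch-Ratcliff iteration, starting from $p^{(0)}(i) = \pi(i)$ and updating $p^{(n+1)}(i) = p^{(n)}(i) \prod_{r} (h_r / h_r^{(n)})^{a_{ri}}$ with $h_r^{(n)} = \sum_i a_{ri} p^{(n)}(i)$. The function names, the use of a single common scale factor for all rows, and the small 2 x 2 example are assumptions chosen for illustration.

```python
import math

def canonical_form(B, k):
    """Rescale rows b_s with targets k_s to a_s = t(u_s + b_s), h_s = t(u_s + k_s).

    A single common scale t is used for every row (one admissible choice),
    and a slack row is appended when the columns of a do not already sum to one.
    Assumes the resulting h values are all positive.
    """
    d, n = len(B), len(B[0])
    u = [max(0.0, -min(row)) for row in B]              # shift rows to be nonnegative
    col = [sum(u[s] + B[s][i] for s in range(d)) for i in range(n)]
    t = 1.0 / max(col)                                  # makes every column sum <= 1
    a = [[t * (u[s] + B[s][i]) for i in range(n)] for s in range(d)]
    h = [t * (u[s] + k[s]) for s in range(d)]
    slack = [1.0 - sum(a[s][i] for s in range(d)) for i in range(n)]
    if max(slack) > 1e-12:                              # case c = d + 1
        a.append(slack)
        h.append(1.0 - sum(h))
    return a, h

def gis(a, h, pi, iterations=500):
    """Iterate p <- p * prod_r (h_r / h_r(p)) ** a_ri, starting from pi."""
    p = list(pi)
    for _ in range(iterations):
        hn = [sum(row[i] * p[i] for i in range(len(p))) for row in a]
        p = [p[i] * math.prod((h[r] / hn[r]) ** a[r][i] for r in range(len(a)))
             for i in range(len(p))]
    return p

# Illustrative 2 x 2 table, cells ordered (11, 12, 21, 22), uniform start.
B = [[1, 1, 0, 0],      # first row total
     [1, 0, 1, 0]]      # first column total
k = [0.6, 0.3]          # required values of those totals
a, h = canonical_form(B, k)
p_star = gis(a, h, [0.25, 0.25, 0.25, 0.25])
```

With a uniform starting distribution and only marginal constraints, the limit should be the product of the prescribed margins, which gives a convenient check on the sketch.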

DARRAT is a computer program that performs the steps and
procedures of the Darroch-Ratcliff generalized iterative scaling
procedure. The iteration will converge at a faster rate if,
instead of modifying the appropriate design matrix as a unit
into the canonical form as above, the design matrix is subdivided
into blocks of related rows (similar to the notion of marginals)
and each block is reduced to the canonical form. The user must
decide which rows of the design matrix are to be put into a
common block, and the program then converts these blocks to
canonical form for cycles within an iteration. As in KULLITR 2 the
program is flexible and can accommodate a variety of experimental
situations. In some problems the value of Nθ may be determined
from some known distribution Z by Nθ = CZ. In such cases it is
not necessary to supply Nθ; the user furnishes Z and the program
computes the restraints Nθ = CZ. For k-samples it is not necessary
for the analyst to compute the appropriate weights and the
matrix W, since if the user provides the B matrix the program
computes the C matrix from it. Of course, if the user desires to
use arbitrary weights not related to the sample sizes, one may
have to supply the C matrix, since in such cases the program
cannot compute it. By properly setting appropriate parameters, in
the case of complete contingency tables, cells will be coded
lexicographically as in other programs for contingency table
analysis.

The  information  that  must  be  supplied  to  the  program  is 
divided  into  three  segments. 

(1)  Parameters 

(2)  Factor  names 

(3)  Table  data  and  constraints 

The parameter list (1) must be followed by ;. The factor
names (2) must be followed by ;. Segment (2) is used only when
the parameter FACTORS is > 1. In case FACTORS > 1 and factor names
are not used, the ; must still be used. In case FACTORS = 1 the ;
must not be used. For segments (1) and (2) the parameter name
followed by = followed by the parameter value must be punched on
cards. The parameters must be separated by a blank. However, the
order of punching the parameters within segment (1) is not important.
In segment (3), only numerical values are punched, and the
numbers must be separated by blanks. Observed values of zero
are punched as 0 but the program treats them automatically
as 0.000001.

JCL  Instructions 

1. // Standard Job Card

2. // EXEC PL1X6,DSN='U.ST6630.IRELAND',PROG=DARRAT

3. //GO.PUNCH DD SYSOUT=B,DCB=(RECFM=F,BLKSIZE=80)

4. //GO.SYSIN DD *

   (1) Parameters

   (2) Factor names

   (3) Table data and constraints

5. /*

The cards numbered 2, 3, 4, 5 above make up the EXEC program.
Card 3 is necessary only if punched output is desired and
may otherwise be omitted. Card 5 follows the parameters,
factor names and table data and constraints and indicates the
end of the run. If several jobs are to be run with one
execution of DARRAT, the parameters, factor names, table data
and constraints for each may be separated by a blank card
and card 5 of the EXEC program placed at the very end.

(1) Parameters (* items are mandatory)

PARAMETER           DEFAULT   EFFECT

TITLE = 'NAME'                Identifies the run by name. The title
                              name (the RHS) must be in apostrophes.

*OBS = n            0         The number of different "cells."

*CNSTRNT = m        0         All the constraints imposed on the final
                              distribution. If C is an m x n matrix
                              then OBS = n and CNSTRNT = m.

CARDS = ' 'B        '0'B      '1'B causes the final distribution to be
                              punched on cards and included as part of
                              the output.

FACTORS = number    1         The number specifies the dimensions of a
                              contingency table and causes the cells
                              to be coded lexicographically.

BLOCKS = number     1         Specifies the number of sets of rows of C
                              to be put into canonical form for cycling
                              through the iteration.

THE PARAMETER LIST MUST BE FOLLOWED BY ;


(2)  Factor  names. 

This segment is used only if FACTORS > 1. Each factor name
in ' ' is preceded by FACNAME(f) = where f is the factor number.
For example, for a 2x2x2 table where the first factor is time,
the second factor is cutting, and the third factor is mortality,
we have

FACNAME(1) = 'TIME'

FACNAME(2) = 'CUTTING'

FACNAME(3) = 'MORTALITY'

This segment is optional, and if used must terminate with ;. If
factor names are not used and FACTORS > 1, the ; must still be supplied.

(3)  Table  data  and  constraints 

In  this  segment  only  the  numerical  values  must  be  supplied 
following  the  indicated  sequence. 

a) Levels. If FACTORS > 1 and we have a 5x6x2 contingency table, for
example, then the numbers 5 6 2 are punched. If we had a 4x3x2x2x2
contingency table, for example, then the numbers 4 3 2 2 2 are
punched. If we had a 12x2x2 contingency table, for example, then
the numbers 12 2 2 are punched. If FACTORS = 1, no values are
punched.

b) BLOCK numbers. Omit if BLOCKS = 1. The matrix C is divided
into a number of sets of rows specified by the parameter BLOCKS
in segment (1). The number of rows of C in each set (or block)
must be specified. These numbers must add to the number of
rows in C (the value of CNSTRNT). For example, if

        1  1  1  1
    C = 1  0  1  0
        1  1  0  0

we might specify BLOCKS = 3, treating each row as a unit, and
punch 1 1 1. There will be three cycles in the iteration.

For example, if

        1  1  0  0  0  0  0  0
        0  0  1  1  0  0  0  0
    B = 0  0  0  0  1  1  0  0
        0  0  0  0  0  0  1  1
        1  0 -1  0 -1  0  1  0

we would specify BLOCKS = 2, treating the first four normalizing
restraints as one block and the last row as another block, and
punch 4 1. For example, if

        1  1  0  0  0  0  0  0
        0  0  1  1  0  0  0  0
    B = 0  0  0  0  1  1  0  0
        0  0  0  0  0  0  1  1
        1  0  1  0  1  0  1  0
        0  0  1  0  2  0  3  0

we would specify BLOCKS = 3, treating the first four normalizing
restraints as one block and each of the fifth and sixth rows
as other blocks, and we punch 4 1 1. The iteration would
proceed through three cycles.
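The bookkeeping behind the BLOCK numbers can be sketched as follows. This is only an illustration, assuming the block sizes are applied to consecutive rows in the order punched; the helper name `split_into_blocks` is hypothetical and is not part of DARRAT.

```python
def split_into_blocks(rows, block_sizes):
    """Split the rows of a design matrix into consecutive blocks.

    The block sizes must add to the number of rows (the value of CNSTRNT).
    """
    if sum(block_sizes) != len(rows):
        raise ValueError("block sizes must add to the number of rows")
    blocks, start = [], 0
    for size in block_sizes:
        blocks.append(rows[start:start + size])
        start += size
    return blocks

# Five-row example: four normalizing restraints plus one further row.
B = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1],
    [1, 0, -1, 0, -1, 0, 1, 0],
]
blocks = split_into_blocks(B, [4, 1])   # corresponds to BLOCKS = 2, punching 4 1
```

Each block would then be reduced to canonical form separately, giving one cycle per block within an iteration.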

c) Partition numbers. If NUMSET > 1, that is, k-samples,
then the number of distinct observations or cells in each set
must appear. These will add to the number of columns of the
C matrix. For example, if NUMSET = 3 with 16 observations in
sample 1, 4 observations in sample 2 and 4 observations in sample
3, then the numbers 16 4 4 are punched (the C matrix has 24
columns). If NUMSET = 4 with two observations in each set then
the numbers 2 2 2 2 are punched (the C matrix has 8 columns).

d) The B or C matrix by rows. The B matrix if BMAT = '1'B,
and the C matrix if BMAT = '0'B.

e) The observed values must be punched in lexicographic order
corresponding to the columns of the C matrix. Observed values
of zero are punched as 0 but the program automatically treats
them as 0.000001.

f) Nθ. This is supplied only if INTERNAL = '0'B. The number of
values must be the same as CNSTRNT = m, that is, the number of rows
of the C matrix.

g) The initial distribution for the iteration. To be supplied
only if INTERNAL = '1'B and UNIF = '0'B.


In the cases when INTERNAL = '0'B, the output includes
2I(x*:x), where x* is the minimum discrimination information
estimate and x the observed values (also the initial distribution
of the iteration). 2I(x*:x) has r = m-k D.F.

In the cases when INTERNAL = '1'B, the output includes
2I(x*:x), where x* is the minimum discrimination information
estimate and x is the initial distribution of the iteration,
and also 2I(z:x*), where z is the observed distribution. The
degrees of freedom for 2I(z:x*) are n-m, where the C matrix is
m x n, and the degrees of freedom for 2I(x*:x) are (m-k)-(m'-k) =
m-m', where the C matrix for the determination of the initial
distribution is m' x n. In this case we also have the analysis of
information relation

    2I(z:x) = 2I(x*:x) + 2I(z:x*)
     n-m'       m-m'       n-m

with the associated degrees of freedom. The output carries the
statement "Z IS OBSERVED TABLE AND X IS INITIAL DISTRIBUTION."


4. GOKHALE

GOKHALE is a computer program that implements an algorithm
presented by D. V. Gokhale (1972), Analysis of log-linear models,
Journal Royal Statist. Soc. Ser. B, 34, No. 3, 371-376. The algorithm
may be characterized as a method of steepest descent. The algorithm
calculates the minimum discrimination information (MDI)
estimate that minimizes

(1) $I = \sum_t p_t \ln (p_t / \pi_t)$

subject to the restraints

(2) $Cp = \theta$.

This is achieved by examining only estimates that satisfy the
restraints (2) and following the gradient $\nabla(I)$ in the direction of
steepest descent. The procedure converges to the MDI estimate.

The program is designed to be as flexible as possible. It
accepts either complete or partial tables and weights the design
matrix in the latter case if the user so indicates (similar to
KULLITR2 and DARRAT). Constraints are either supplied or the
program will calculate them from a user-supplied distribution.

In the output are listed the values of the MDI estimate, the
values of the parameters in the log-linear model, and the covariance
matrix of the values of the parameters. By properly setting
appropriate parameters in the program, in the case of complete
contingency tables, cells will be coded lexicographically as in
other programs for contingency table analysis.

The information that must be supplied to the program is
divided into three segments:

(1)  Parameters 

(2)  Factor  names 

(3)  Table  data  and  constraints. 

The parameter list (1) must be followed by ;. The factor names (2)
must be followed by ;. Segment (2) is used only when the parameter
FACTORS is greater than 1. In case FACTORS > 1 and factor
names are not used, the ; must still be used. In case FACTORS = 1
the ; must not be used. For segments (1) and (2) the parameter
name followed by = followed by the parameter value must be punched
on cards. The parameters must be separated by a blank space.

The order of punching the parameters within segment (1) is not
important. In segment (3) only numerical values are punched, and
the numbers must be separated by blank spaces. Observed values
of zero are punched as 0 but the program treats them automatically
as 0.000001.

JCL  Instructions 

1. // Standard Job Card

2. // EXEC PL1X6,DSN='U.ST6630.IRELAND',PROG=GOKHALE

3. //GO.PUNCH DD SYSOUT=B,DCB=(RECFM=F,BLKSIZE=80)

4. //GO.SYSIN DD *

   (1) Parameters

   (2) Factor names

   (3) Table data and constraints

5. /*

The cards numbered 2, 3, 4, 5 above make up the EXEC program.
Card 3 is necessary only if punched output is desired and may
otherwise be omitted. Card 5 follows the parameters, factor
names and the data and constraints and indicates the end of the
run. If several jobs are to be run with one execution of GOKHALE,
the parameters, factor names, table data and constraints for each
may be separated by a blank card and card 5 of the EXEC program
placed at the very end.


(1) Parameters (* items are mandatory)

PARAMETER           DEFAULT   EFFECT

TITLE='NAME'                  Identifies the run by name. The title
                              name (the RHS) must be in apostrophes.

*OBS=n              0         The number of different "cells."

*CNSTRNT=m          0         All the constraints imposed on the final
                              distribution. If C is an m x n matrix
                              then OBS=n and CNSTRNT=m.

EPZ=number          .0001     When the length of the gradient becomes
                              smaller than EPZ, the algorithm is deemed
                              to have converged.

CARDS=' 'B          '0'B      '1'B causes the final distribution to be
                              punched on cards and included as part of
                              the output.

FACTORS=number      1         The number specifies the dimensions of a
                              contingency table and causes the cells
                              to be coded lexicographically.

NUMSET=k                      The number k is the number of samples in
                              the k-sample problem.

INTERNAL=' 'B       '1'B      '1'B causes the restraints Nθ to be
                              calculated as CZ from a user-supplied
                              distribution Z. '0'B implies that Nθ
                              will be supplied.

TOPCOUNT=           36        If the program does not converge (satisfy
                              EPZ) after the number of iterations
                              specified by TOPCOUNT, then EPZ is
                              multiplied by 10.

BMAT=' 'B           '0'B      If BMAT='1'B the program expects only the
                              B matrix to be supplied and will compute
                              the C matrix by weighting the B matrix
                              properly. If BMAT='0'B the C matrix must
                              be supplied.

UNIF=' 'B           '1'B      This parameter applies only when
                              INTERNAL='1'B. If UNIF='1'B, the initial
                              distribution in the iteration will be the
                              uniform distribution and need not be
                              supplied; the program computes it. If
                              UNIF='0'B the initial distribution for
                              the iteration must be supplied.

MATDIF=' 'B         '0'B      '1'B implies that ill-conditioned matrices
                              may appear and inverts with special
                              procedures. '0'B uses standard procedures
                              and will apply in most cases.

THE PARAMETER LIST MUST BE FOLLOWED BY A SEMICOLON ;

(2) Factor Names

This segment is used only if FACTORS>1. Each factor name
in ' ' is preceded by FACNAME(f)= where f is the factor number.
For example, for a 2x2x2 table where the first factor is
time, the second factor is cutting and the third factor is
mortality, we have

FACNAME(1)='TIME' FACNAME(2)='CUTTING' FACNAME(3)='MORTALITY'

This segment is optional and if used must terminate with ;.
If not used, the ; must still be supplied if FACTORS>1. If
FACTORS=1 no factor names are given and no semicolon is punched.

(3)  Table  data  and  constraints 

In  this  segment  only  the  numerical  values  must  be  supplied 
following  the  indicated  sequence. 

LEVELS. If FACTORS>1 and we have a 5x6x2 contingency table
then the numbers 5 6 2 are punched. If we had a 4x3x2x2x2
contingency table then the numbers 4 3 2 2 2 are punched. If we
had a 12x2x2 contingency table then the numbers 12 2 2 are punched.
If FACTORS=1 no values are punched.

PARTITION NUMBERS. If NUMSET>1, that is, k-samples, then
the number of distinct observations or cells in each set must
appear. These will add to the number of columns of the C-matrix.
For example, if NUMSET=3 with 16 observations in sample 1, 4
observations in sample 2 and 4 observations in sample 3, then
the numbers 16 4 4 are punched (the C-matrix has 24 columns).
If NUMSET=4 with two observations in each set then the numbers
2 2 2 2 are punched (the C-matrix has 8 columns).

The B or C matrix by rows. The B-matrix if BMAT='1'B,
and the C-matrix if BMAT='0'B.

The  observed  values  must  be  punched  in  lexicographic  order 
corresponding  to  the  columns  of  the  C-matrix.  Observed  values 
of  zero  are  punched  as  0  but  the  program  automatically  treats 
them  as  0.000001. 

Nθ. This is supplied only if INTERNAL='0'B. The number
of values must be the same as CNSTRNT=m, that is, the number
of rows of the C-matrix.

The initial distribution for the iteration. To be supplied
only if INTERNAL='1'B and UNIF='0'B.

Remarks. In the cases when INTERNAL='0'B, the output includes
X^2, the minimum modified chi-squared value (the quadratic
approximation to 2I(x*:x)), and the minimum modified chi-squared
estimates, which are used as the initial values in the
iteration, since they satisfy the constraints. Both X^2 and
2I(x*:x), where x* is the MDI estimate and x the observed values,
are asymptotically distributed as chi-squared with r=m-k degrees
of freedom.

5. MATGEN

MATGEN is a computer program that generates and provides
punched card output of design matrices, the B or C matrices,
for use as input for the programs KULLITR2, DARRAT, GOKHALE.

We recall that the program CONTABMOD generates the design
matrices for models fitting various sets of observed marginals for
use in computing the tau parameters and their covariance matrix
as part of the program output.

By considering the string of the successive rows of the
matrix as made up of vectors of appropriate sizes it will usually
be found that a relatively small number of different vectors
have to be assembled to compose the matrix.

The input to MATGEN consists of two segments. The first
contains parameter values and these must include parameter name
followed by =. The second segment consists of a set of numerical
values that must be entered in a prescribed order.


(1) Parameter list

PARAMETER           DEFAULT   EFFECT

ROWS=m              1         m is the number of rows of the m x n
                              matrix

COLS=n              1         n is the number of columns of the m x n
                              matrix

VECTSIZES=k         1         k is the number of different size basic
                              generating vectors

THIS PARAMETER LIST MUST TERMINATE WITH ;

(2) Numerical Values

NUMBER SIZE LIST. This is a list of ordered pairs of numbers.
The first of the pair is the number of basic vectors whose
size (length) is given by the second of the pair. For example

2 4 3 2

means two basic vectors of length four and three basic vectors
of length two. For this case VECTSIZES=2.

BASIC VECTOR LIST. The vectors must be entered according to
the lengths specified in the NUMBER SIZE LIST. All vectors of
length four would be entered first, followed by the vectors of
length two.

GENERATION LIST. This list consists of pairs of numbers. The
first component of the pair is the number of successive
occurrences of the vector whose ordinal number in the basic vector
list is the second component of the pair.
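The assembly that the three lists describe can be sketched as follows. This is an illustration of the idea rather than MATGEN itself: the function name `assemble_matrix`, the basic vectors and the generation pairs are all illustrative values chosen here, and the sketch assumes the generated entries are simply strung together and cut into rows of COLS entries.

```python
def assemble_matrix(basic_vectors, generation, cols):
    """Build matrix rows from (count, ordinal) pairs over a basic vector list.

    Each pair appends `count` successive copies of the basic vector whose
    1-based position in the list is `ordinal`; the resulting string of
    entries is then cut into rows of `cols` entries.
    """
    flat = []
    for count, ordinal in generation:
        flat.extend(basic_vectors[ordinal - 1] * count)
    if len(flat) % cols:
        raise ValueError("generated entries do not fill whole rows")
    return [flat[i:i + cols] for i in range(0, len(flat), cols)]

# Two basic vectors of length four, three of length two (NUMBER SIZE LIST 2 4 3 2).
basic = [[1, 1, 0, 0], [1, 0, 1, 0], [1, 1], [0, 0], [1, 0]]
# Illustrative generation pairs (count, ordinal) for four rows of eight columns.
generation = [(2, 1), (2, 2), (2, 3), (2, 4), (2, 4), (2, 3)]
matrix = assemble_matrix(basic, generation, cols=8)
```

Here the pairs (2, 3), (2, 4) produce the row 1 1 1 1 0 0 0 0 from two copies of the third vector followed by two copies of the fourth.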

JCL  Instructions 

1. // Standard Job Card

2. //#EXEC#PL1X6,DSN='U.ST6630.IRELAND',PROG=MATGEN

3. //GO.PUNCH#DD#SYSOUT=B,DCB=(RECFM=FB,BLKSIZE=80)

4. //GO.SYSIN#DD#*

5. /*

Note that # represents a blank space. Card 5 follows the
numerical values and terminates the program.



Example.  Suppose  we  want  to  generate  the  following  matrix  (Of 

course  we  would  not  use  the  program  for  such  a  matrix  but  would 

punch  it  directly.  However,  it  will  illustrate  the  procedure.) 

11001100 
10101010 
0  0  1  0  1  0 1 0  0 

0  0  0  0 1 1  1  0  0 

1  0  1  1  0  0  1  1 

11110000 
00001111 


EXEC  Cards 

ROWS=7  COLS=8  VECTSIZES=2  ; 
2  4  3  2 

1100  1010 
11  0  0  10 


2  1 
2  2 

14  1234  11 

15  11  3  3 

4  4  2  3 

/* 

Note that the vectors in ordinal number are

1st   1 1 0 0

2nd   1 0 1 0

3rd   1 1

4th   0 0

5th   1 0

It is not necessary that the elements of the matrix consist
only of 0's and 1's. Negative values may occur also. A
vector may be

.833333 .833333 0 0

or

0 -1 0 -1

or

0 1 2

etc., depending on the problem requirement.

7. No Interaction on a Linear Scale
in a 2 x 2 x 2 Contingency Table

1.  Minimum  discrimination  information  estimation. 

Consider the population 2 x 2 x 2 contingency table 1.

The experimental procedure selects a fixed number of
observations under the four possible combinations of the
factors (B, β), (C, γ) and determines the number of occurrences
of (A, α) for each case. In effect, then, the procedure is
examining four binomials with

(1) P(1jk) + P(2jk) = 1, j = 1, 2, k = 1, 2.

The corresponding observed values are shown in table 2.
It is desired to test whether the observed values are
consistent with a null hypothesis of no interaction on
a linear scale,


Table 1

                      B (j=1)                 β (j=2)
                C (k=1)    γ (k=2)      C (k=1)    γ (k=2)

    A (i=1)     P(111)     P(112)       P(121)     P(122)
    α (i=2)     P(211)     P(212)       P(221)     P(222)


Table  2 


that is

(2) H0: P(111) - P(112) = P(121) - P(122)

or P(111) - P(112) - P(121) + P(122) = 0.


We shall determine estimates for the cell entries
subject to the null hypothesis and compare the estimated
and observed values. The estimated table is given in
table 3, where the λ's are to be determined.


Table  3 


We shall use the principle of minimum discrimination
information estimation and thus determine the λ's which
minimize

$$
\begin{aligned}
&(x(111)+\lambda_1)\ln\frac{x(111)+\lambda_1}{x(111)} + (x(211)-\lambda_1)\ln\frac{x(211)-\lambda_1}{x(211)} \\
&+ (x(112)+\lambda_2)\ln\frac{x(112)+\lambda_2}{x(112)} + (x(212)-\lambda_2)\ln\frac{x(212)-\lambda_2}{x(212)} \\
&+ (x(121)+\lambda_3)\ln\frac{x(121)+\lambda_3}{x(121)} + (x(221)-\lambda_3)\ln\frac{x(221)-\lambda_3}{x(221)} \\
&+ (x(122)+\lambda_4)\ln\frac{x(122)+\lambda_4}{x(122)} + (x(222)-\lambda_4)\ln\frac{x(222)-\lambda_4}{x(222)} \\
&+ \tau\left(\frac{x(111)+\lambda_1}{x(\cdot 11)} - \frac{x(112)+\lambda_2}{x(\cdot 12)} - \frac{x(121)+\lambda_3}{x(\cdot 21)} + \frac{x(122)+\lambda_4}{x(\cdot 22)}\right),
\end{aligned} \tag{3}
$$

where $\tau$ is a Lagrange undetermined multiplier and (2) is
reflected by the condition

$$\frac{x(111)+\lambda_1}{x(\cdot 11)} - \frac{x(112)+\lambda_2}{x(\cdot 12)} - \frac{x(121)+\lambda_3}{x(\cdot 21)} + \frac{x(122)+\lambda_4}{x(\cdot 22)} = 0. \tag{4}$$


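Differentiating (3) with respect to $\lambda_1, \ldots, \lambda_4$ leads
to the "normal" equations

$$
\begin{aligned}
\ln\frac{x(111)+\lambda_1}{x(111)} - \ln\frac{x(211)-\lambda_1}{x(211)} + \frac{\tau}{x(\cdot 11)} &= 0, \\
\ln\frac{x(112)+\lambda_2}{x(112)} - \ln\frac{x(212)-\lambda_2}{x(212)} - \frac{\tau}{x(\cdot 12)} &= 0, \\
\ln\frac{x(121)+\lambda_3}{x(121)} - \ln\frac{x(221)-\lambda_3}{x(221)} - \frac{\tau}{x(\cdot 21)} &= 0, \\
\ln\frac{x(122)+\lambda_4}{x(122)} - \ln\frac{x(222)-\lambda_4}{x(222)} + \frac{\tau}{x(\cdot 22)} &= 0.
\end{aligned} \tag{5}
$$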


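There are a number of different iterative approaches to
determine the solution to (5), but our interest here is to
examine the relation of an approximate solution to other
proposed methods.

Assuming that the ratios of the λ's to the observed
values are small, we use the approximations

$$\ln\frac{x(111)+\lambda_1}{x(111)} \approx \frac{\lambda_1}{x(111)}, \qquad \ln\frac{x(211)-\lambda_1}{x(211)} \approx -\frac{\lambda_1}{x(211)}, \quad \text{etc.}$$

in (5) and get

$$
\begin{aligned}
\frac{\lambda_1}{x(111)} + \frac{\lambda_1}{x(211)} + \frac{\tau}{x(\cdot 11)} &= 0 = \lambda_1 \frac{x(\cdot 11)}{x(111)x(211)} + \frac{\tau}{x(\cdot 11)}, \\
\frac{\lambda_2}{x(112)} + \frac{\lambda_2}{x(212)} - \frac{\tau}{x(\cdot 12)} &= 0 = \lambda_2 \frac{x(\cdot 12)}{x(112)x(212)} - \frac{\tau}{x(\cdot 12)}, \\
\frac{\lambda_3}{x(121)} + \frac{\lambda_3}{x(221)} - \frac{\tau}{x(\cdot 21)} &= 0 = \lambda_3 \frac{x(\cdot 21)}{x(121)x(221)} - \frac{\tau}{x(\cdot 21)}, \\
\frac{\lambda_4}{x(122)} + \frac{\lambda_4}{x(222)} + \frac{\tau}{x(\cdot 22)} &= 0 = \lambda_4 \frac{x(\cdot 22)}{x(122)x(222)} + \frac{\tau}{x(\cdot 22)}.
\end{aligned} \tag{6}
$$

From (6) and (4) we have, introducing the notation
$x(1jk) = x(\cdot jk)\,p(jk)$, $x(2jk) = x(\cdot jk)\,q(jk)$, $p(jk) + q(jk) = 1$,

$$
\begin{aligned}
\lambda_1 &= -\frac{x(111)x(211)}{(x(\cdot 11))^2}\,\tau = -p(11)q(11)\,\tau, \\
\lambda_2 &= \frac{x(112)x(212)}{(x(\cdot 12))^2}\,\tau = p(12)q(12)\,\tau, \\
\lambda_3 &= \frac{x(121)x(221)}{(x(\cdot 21))^2}\,\tau = p(21)q(21)\,\tau, \\
\lambda_4 &= -\frac{x(122)x(222)}{(x(\cdot 22))^2}\,\tau = -p(22)q(22)\,\tau, \\
\tau &= \frac{p(11) - p(12) - p(21) + p(22)}{\dfrac{p(11)q(11)}{x(\cdot 11)} + \dfrac{p(12)q(12)}{x(\cdot 12)} + \dfrac{p(21)q(21)}{x(\cdot 21)} + \dfrac{p(22)q(22)}{x(\cdot 22)}}.
\end{aligned} \tag{7}
$$

Let us write

$$
\begin{aligned}
x^*(111) &= x(111) + \lambda_1, & x^*(211) &= x(211) - \lambda_1, \\
x^*(112) &= x(112) + \lambda_2, & x^*(212) &= x(212) - \lambda_2, \quad \text{etc.,}
\end{aligned} \tag{8}
$$

where the λ's satisfy (5).

If we also use the approximations

$$
2\left\{(x(111)+\lambda_1)\ln\frac{x(111)+\lambda_1}{x(111)} + (x(211)-\lambda_1)\ln\frac{x(211)-\lambda_1}{x(211)}\right\}
\approx \lambda_1^2\left(\frac{1}{x(111)} + \frac{1}{x(211)}\right) = \lambda_1^2\,\frac{x(\cdot 11)}{x(111)x(211)} = \frac{\lambda_1^2}{x(\cdot 11)\,p(11)q(11)}, \tag{9}
$$

then we get for the minimum discrimination information
statistic

$$
\begin{aligned}
2I(x^*:x) &= 2\sum_i\sum_j\sum_k x^*(ijk)\ln\frac{x^*(ijk)}{x(ijk)} \\
&\approx \tau^2\left(\frac{p(11)q(11)}{x(\cdot 11)} + \frac{p(12)q(12)}{x(\cdot 12)} + \frac{p(21)q(21)}{x(\cdot 21)} + \frac{p(22)q(22)}{x(\cdot 22)}\right) \\
&= \frac{(p(11) - p(12) - p(21) + p(22))^2}{\dfrac{p(11)q(11)}{x(\cdot 11)} + \dfrac{p(12)q(12)}{x(\cdot 12)} + \dfrac{p(21)q(21)}{x(\cdot 21)} + \dfrac{p(22)q(22)}{x(\cdot 22)}} \\
&= \lambda_1^2\left(\frac{1}{x(111)} + \frac{1}{x(211)}\right) + \lambda_2^2\left(\frac{1}{x(112)} + \frac{1}{x(212)}\right) + \cdots + \lambda_4^2\left(\frac{1}{x(122)} + \frac{1}{x(222)}\right).
\end{aligned} \tag{10}
$$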


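Note that the last value in (10) is the modified
Neyman $\chi^2$

$$X^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{obs}} \tag{11}$$

and indeed the equations in (6) are those to determine
the minimum modified $\chi^2$ estimates. The next-to-last
value in (10) is the statistic given by Bhapkar and
Koch (1968, p. 116) based on a criterion due to Wald.
The square root of this value is the statistic used by
Snedecor and Cochran (1967, p. 496).

In accordance with the minimum discrimination
information theorem the log-linear representation for
x*(ijk) is given graphically as in figure 1, where the
interpretation is

$$
\begin{aligned}
\ln\frac{x^*(111)}{x(111)} &= L_1 + \frac{\tau}{x(\cdot 11)}, & \ln\frac{x^*(211)}{x(211)} &= L_1, \\
\ln\frac{x^*(112)}{x(112)} &= L_2 - \frac{\tau}{x(\cdot 12)}, & \ln\frac{x^*(212)}{x(212)} &= L_2, \\
\ln\frac{x^*(121)}{x(121)} &= L_3 - \frac{\tau}{x(\cdot 21)}, & \ln\frac{x^*(221)}{x(221)} &= L_3, \\
\ln\frac{x^*(122)}{x(122)} &= L_4 + \frac{\tau}{x(\cdot 22)}, & \ln\frac{x^*(222)}{x(222)} &= L_4.
\end{aligned} \tag{12}
$$

Recalling (8) we see that (12) in fact leads to
(5) (with the sign of $\tau$ reversed). If we write

$$
\begin{aligned}
\theta^* &= \frac{x^*(111)}{x(\cdot 11)} - \frac{x^*(112)}{x(\cdot 12)} - \frac{x^*(121)}{x(\cdot 21)} + \frac{x^*(122)}{x(\cdot 22)} = p^*(11) - p^*(12) - p^*(21) + p^*(22), \\
\theta &= \frac{x(111)}{x(\cdot 11)} - \frac{x(112)}{x(\cdot 12)} - \frac{x(121)}{x(\cdot 21)} + \frac{x(122)}{x(\cdot 22)} = p(11) - p(12) - p(21) + p(22),
\end{aligned} \tag{13}
$$

then as shown in Kullback (1959, p. 101-106)

$$2I(x^*:x) \approx (\theta^* - \theta)^2/\sigma^2, \tag{14}$$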


2 

where  <j  is  determined  as  follows.  Let  T  denote  the 
8x5  matrix  in  figure  1,  that  is, 


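$$T = \begin{pmatrix}
1 & 0 & 0 & 0 & 1/x(.11)\\
1 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & -1/x(.12)\\
0 & 1 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & -1/x(.21)\\
0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 1 & 1/x(.22)\\
0 & 0 & 0 & 1 & 0
\end{pmatrix} \tag{15}$$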


and $D_x$ the 8x8 diagonal matrix with entries x(ijk), that is,

$$D_x = \mathrm{diag}\bigl(x(111),\, x(211),\, x(112),\, x(212),\, x(121),\, x(221),\, x(122),\, x(222)\bigr). \tag{16}$$


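Compute the 5x5 matrix $S = T'D_xT$ and partition it as follows

$$S = \begin{pmatrix} S_{11} & S_{12}\\ S_{21} & S_{22} \end{pmatrix}, \tag{17}$$

$S_{11}$ is 4 x 4, $S_{22}$ is 1 x 1, $S_{21} = S_{12}'$ is 1 x 4,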
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then $\sigma^2$ in (14) is given by

$$\sigma^2 = S_{22\cdot 1} = S_{22} - S_{21}S_{11}^{-1}S_{12}. \tag{18}$$


It may be verified that this results in

$$\sigma^2 = \frac{x(111)x(211)}{(x(.11))^3} + \frac{x(112)x(212)}{(x(.12))^3} + \frac{x(121)x(221)}{(x(.21))^3} + \frac{x(122)x(222)}{(x(.22))^3}$$

$$= \frac{p(11)q(11)}{x(.11)} + \frac{p(12)q(12)}{x(.12)} + \frac{p(21)q(21)}{x(.21)} + \frac{p(22)q(22)}{x(.22)}. \tag{19}$$


But $\theta^*$ in (13) is zero and we see that (14) is indeed the next-to-last value in (10). It is interesting to note that 2I(x*:x) can be approximated without necessarily computing the values of x*(ijk).
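
The verification of (19) can be sketched as follows, writing p(jk) = x(1jk)/x(.jk) and q(jk) = 1 - p(jk). The blocks are read off from T in (15) and D_x in (16); the intermediate expressions are our own working and are not part of the original text.

```latex
\begin{aligned}
S_{11} &= \mathrm{diag}\bigl(x(.11),\,x(.12),\,x(.21),\,x(.22)\bigr),\\
S_{12} &= S_{21}' = \bigl(p(11),\,-p(12),\,-p(21),\,p(22)\bigr)',\\
S_{22} &= \sum_{j,k}\frac{x(1jk)}{(x(.jk))^{2}} = \sum_{j,k}\frac{p(jk)}{x(.jk)},\\
S_{22\cdot 1} &= S_{22}-S_{21}S_{11}^{-1}S_{12}
 = \sum_{j,k}\frac{p(jk)-p(jk)^{2}}{x(.jk)}
 = \sum_{j,k}\frac{p(jk)\,q(jk)}{x(.jk)} .
\end{aligned}
```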
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We shall illustrate the preceding discussion by Bartlett's data on root cuttings, used also as an example by Snedecor and Cochran (1967), Bhapkar and Koch (1968), and Berkson (1972).

The following table from Bartlett (1935), Contingency table interactions, J. Roy. Statist. Soc. Suppl. 2, 248-252, who refers to data from Hoblyn and Palmer, is the result of an experiment designed to investigate the propagation of plum root stocks from root cuttings. There were 240 cuttings for each of the four treatments.


                    At Once (j=1)              In Spring (j=2)
                Long (k=1)   Short (k=2)   Long (k=1)   Short (k=2)
Dead   (i=1)        84          133           156          209
Alive  (i=2)       156          107            84           31
Total              240          240           240          240

From (7) it is found that t = 4(240)²/46918, $\lambda_1$ = -1.117183, $\lambda_2$ = 1.213266, $\lambda_3$ = 1.117183, $\lambda_4$ = -0.552368, and hence the minimum modified $\chi^2$ estimates are:

                   j=1                        j=2
             k=1          k=2           k=1          k=2
i=1       82.882817   134.213266    157.117183   208.447632
i=2      157.117183   105.786734     82.882817    31.552368

From (18) it is found that 2I(x*:x) is approximately 0.08184492, 1 degree of freedom.
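
As a numerical check of (13) to (19) on Bartlett's data, the following is a minimal sketch using NumPy. The variable names are ours. The closed form used for the multipliers, lam = -t * sign * p * q with t = theta / sigma^2, is our own derivation from minimizing the last form of (10) subject to the constraint; equation (7) itself is not reproduced in this section, so treat that step as an assumption.

```python
import numpy as np

# Observed counts in the cell order of (15)-(16): 111, 211, 112, 212, 121, 221, 122, 222
x = np.array([84, 156, 133, 107, 156, 84, 209, 31], dtype=float)
n = x.reshape(4, 2).sum(axis=1)           # totals x(.jk); all equal to 240 here
signs = np.array([1.0, -1.0, -1.0, 1.0])  # signs in the last column of T

# Design matrix T of (15): four indicator columns plus the interaction column
T = np.zeros((8, 5))
for c in range(4):
    T[2 * c, c] = T[2 * c + 1, c] = 1.0
    T[2 * c, 4] = signs[c] / n[c]

# S = T' D_x T partitioned as in (17); sigma^2 = S22.1 as in (18)
S = T.T @ np.diag(x) @ T
S11, S12, S21, S22 = S[:4, :4], S[:4, 4:], S[4:, :4], S[4:, 4:]
sigma2 = (S22 - S21 @ np.linalg.solve(S11, S12)).item()

# theta of (13) and the quadratic approximation (14), with theta* = 0
p = x[0::2] / n
theta = float(signs @ p)
approx_2I = theta**2 / sigma2

# Multipliers (assumed closed form) and the estimates of (8)
t = theta / sigma2
lam = -t * signs * p * (1.0 - p)
x_star_i1 = x[0::2] + lam   # estimates for i = 1
x_star_i2 = x[1::2] - lam   # estimates for i = 2

print(approx_2I)
print(lam)
print(x_star_i1, x_star_i2)
```

Under these assumptions the printed quantities can be compared directly with the values quoted above for t, the multipliers, the estimates and the approximation to 2I(x*:x).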
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Bartlett's root cutting data was also used to illustrate other computer programs. The input cards for KULLITR2 were

TITLE = 'BARTLETT'S ROOT CUTTINGS'
TOL1 = .001 TOL2 = .001 CNSTRNTS = 5 OBS = 8
BMAT = '1'B INTERNAL = '0'B NUMSET = 4 FACTORS = 3 ;
FACNAME(1) = 'TIME' FACNAME(2) = 'CUTTING' FACNAME(3) = 'MORTALITY'
2 2 2 2 2 2 2
1  1  0  0  0  0  0  0
0  0  1  1  0  0  0  0
0  0  0  0  1  1  0  0
0  0  0  0  0  0  1  1
1  0 -1  0 -1  0  1  0
84 156 133 107 156 84 209 31
960 960 960 960 0

Note that the computer output gives 2I(x*:x) = 0.080972 and the minimum modified chi-squared as $X^2$ = 0.081845. The computer output follows.
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BARTLETT'S ROOT CUTTINGS

3 [illegible] TABLE: TIME-CUTTING-MORTALITY

B MATRIX

      1   2   3   4   5   6   7   8
 1    1   1   0   0   0   0   0   0
 2    0   0   1   1   0   0   0   0
 3    0   0   0   0   1   1   0   0
 4    0   0   0   0   0   0   1   1
 5    1   0  -1   0  -1   0   1   0

WEIGHT(1) = 0.250000
WEIGHT(2) = 0.250000
WEIGHT(3) = 0.250000
WEIGHT(4) = 0.250000

INV_WEIGHT(1) = 4.000000
INV_WEIGHT(2) = 4.000000
INV_WEIGHT(3) = 4.000000
INV_WEIGHT(4) = 4.000000

C DESIGN MATRIX

      1   2   3   4   5   6   7   8
 1    4   4   0   0   0   0   0   0
 2    0   0   4   4   0   0   0   0
 3    0   0   0   0   4   4   0   0
 4    0   0   0   0   0   0   4   4
 5    4   0  -4   0  -4   0   4   0

OBSERVED VALUES

 1 1 1   X(1) =  84.000000   LN_X(1) = 4.430817
 1 1 2   X(2) = 156.000000   LN_X(2) = 5.049856
 1 2 1   X(3) = 133.000000   LN_X(3) = 4.890349
 1 2 2   X(4) = 107.000000   LN_X(4) = 4.672829
 2 1 1   X(5) = 156.000000   LN_X(5) = 5.049856
 2 1 2   X(6) =  84.000000   LN_X(6) = 4.430817
 2 2 1   X(7) = 209.000000   LN_X(7) = 5.342334
 2 2 2   X(8) =  31.000000   LN_X(8) = 3.433987

CONSTRAINTS

NTHETA(1) = 960.000000
NTHETA(2) = 960.000000
NTHETA(3) = 960.000000
NTHETA(4) = 960.000000
NTHETA(5) =   0.000000
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ESTIMATE OF NTHETA AT COUNT =

NTHAT(1) = 960.000000
NTHAT(2) = 960.000000
NTHAT(3) = 960.000000
NTHAT(4) = 960.000000
NTHAT(5) =  16.000000

S

          1       2       3       4       5
 1     3840       0       0       0    1344
 2        0    3840       0       0   -2128
 3        0       0    3840       0   -2496
 4        0       0       0    3840    3344
 5     1344   -2128   -2496    3344    9312

S22.1 = 3127.866443

S22.1_INV = 0.000320

DELTA(1) =   0.000000
DELTA(2) =   0.000000
DELTA(3) =   0.000000
DELTA(4) =   0.000000
DELTA(5) = -16.000000

XSQ = 0.081845

ESTIMATE OF X AT COUNT =

[The remainder of this page of output, listing XSTAR(1), XSTAR(2), ..., is not legible in the available copy.]
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We also illustrate the first two iterative steps in the Darroch-Ratcliff iterative procedure applied to Bartlett's root cutting data.
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The  DARRAT  computer  program  using  the  initial  distribution  as  the 
uniform  after  31  iterations  yielded  the  minimum  discrimination  information 
estimates 
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The computer output using the GOKHALE program on Bartlett's root cutting data follows.
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8.  Further  Applications 

In this chapter we consider six examples illustrating the application of the k-sample and the general linear hypothesis techniques.

Example 1. Gail's data. This example illustrates the procedure for getting m.d.i. estimates under hypotheses about the underlying probabilities of two contingency tables and testing the null hypothesis. An analysis of information table is also given in this case, including a subhypothesis. Note the difference in the analysis of information from those for the fitting problems.
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Example 
Gail's  Data 


As an illustration of the k-sample approach consider the following two contingency tables (artificial data) considered by Gail (1974, p. 97).


           a)                            b)

  20     5     5  |  30         15    15     2  |  32
   6     4     2  |  12         10     5     5  |  20
  ----------------+----        -----------------+----
  26     9     7  |  42         25    20     7  |  52

                      Table 1


The problem of interest was whether the underlying probabilities in the two tables were such that the respective marginal probabilities of the two tables were the same. If so, could it be a consequence of the fact that the tables were homogeneous?

Let us denote the observed values in the two tables as in Table 2.


                 a)                                        b)

  x(111)  x(112)  x(113)  |  x(11.)        x(211)  x(212)  x(213)  |  x(21.)
  x(121)  x(122)  x(123)  |  x(12.)        x(221)  x(222)  x(223)  |  x(22.)
  ------------------------+--------        ------------------------+--------
  x(1.1)  x(1.2)  x(1.3)  |  N1            x(2.1)  x(2.2)  x(2.3)  |  N2

                                 Table 2

For the hypothesis that the respective marginal probabilities are the same the basic values for the k-sample approach follow.
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  ω     No.    x(ω)     ln x(ω)

 111     1      20     2.995732        N1 = 42
 112     2       5     1.609438        N2 = 52
 113     3       5     1.609438        N  = 94
 121     4       6     1.791759        w1 = 42/94 = 0.446808
 122     5       4     1.386294        w2 = 52/94 = 0.553191
 123     6       2     0.693147        v1 = 1/w1 = 2.238094
 211     7      15     2.708050        v2 = 1/w2 = 1.807692
 212     8      15     2.708050
 213     9       2     0.693147
 221    10      10     2.302585
 222    11       5     1.609438
 223    12       5     1.609438

The B matrix for this hypothesis and the values of $\theta$ and $N\theta$ are given in Table 3.

$$W_1 = w_1 I_6, \qquad W_2 = w_2 I_6, \qquad W = \begin{pmatrix} W_1 & 0\\ 0 & W_2 \end{pmatrix},$$

where $I_6$ is the 6 x 6 identity matrix, and

$$C = BW^{-1}, \qquad C = \begin{pmatrix} C_1\\ C_2 \end{pmatrix},$$

$C_1$ is 2 x 12, $C_2$ is 3 x 12.
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The C matrix is obtained by multiplying all the elements in the first 6 columns of the B matrix by 2.238094 and by multiplying all the elements in the last 6 columns of the B matrix by 1.807692.


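$$Cx = N\hat\theta = \bigl(94,\; 94,\; 9.296700,\; 12.998163,\; -16.010971\bigr)',$$

$$S = CD_xC' = \begin{pmatrix} S_{11} & S_{12}\\ S_{21} & S_{22} \end{pmatrix}, \qquad S_{22\cdot 1} = S_{22} - S_{21}S_{11}^{-1}S_{12},$$

$$S_{22\cdot 1}^{-1} = \begin{pmatrix}
 0.012198 & -0.001927 & -0.001776\\
-0.001927 &  0.022284 &  0.017520\\
-0.001776 &  0.017520 &  0.027001
\end{pmatrix}, \qquad
d = \begin{pmatrix} -9.296700\\ -12.998163\\ 16.010971 \end{pmatrix}.$$

The minimum modified $\chi^2$ value, the quadratic approximation to 2I(x*:x), is $X^2 = d'S_{22\cdot 1}^{-1}d$ = 4.512, 3 D.F.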
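
The quadratic approximation for the marginal homogeneity hypothesis can be reproduced with a short NumPy sketch. Table 3 is not legible in this copy, so the B matrix below is our assumption: two rows fixing the table totals, followed by the differences of the first-row marginal and of the first two column marginals between the two tables. With that choice, C x reproduces the values quoted above.

```python
import numpy as np

# Cell order: 111, 112, 113, 121, 122, 123, 211, ..., 223 (table, row, column)
x = np.array([20, 5, 5, 6, 4, 2, 15, 15, 2, 10, 5, 5], dtype=float)
N1, N2 = x[:6].sum(), x[6:].sum()
N = N1 + N2
w = np.r_[np.full(6, N1 / N), np.full(6, N2 / N)]   # diagonal of W

# Assumed B matrix (Table 3 is illegible): totals, then marginal differences
ones, zeros = np.ones(6), np.zeros(6)
row1 = np.array([1, 1, 1, 0, 0, 0], dtype=float)
col1 = np.array([1, 0, 0, 1, 0, 0], dtype=float)
col2 = np.array([0, 1, 0, 0, 1, 0], dtype=float)
B = np.array([
    np.r_[ones, zeros],
    np.r_[zeros, ones],
    np.r_[row1, -row1],
    np.r_[col1, -col1],
    np.r_[col2, -col2],
])

C = B / w                                   # C = B W^{-1}: scale each column by 1/w
Cx = C @ x
N_theta = np.array([N, N, 0.0, 0.0, 0.0])
d = (N_theta - Cx)[2:]

S = C @ np.diag(x) @ C.T
S11, S12, S21, S22 = S[:2, :2], S[:2, 2:], S[2:, :2], S[2:, 2:]
S221 = S22 - S21 @ np.linalg.solve(S11, S12)
S221_inv = np.linalg.inv(S221)
X2 = float(d @ S221_inv @ d)

print(Cx)
print(S221_inv)
print(X2)
```

The same function of B, x and w applies to the homogeneity hypothesis once the five constraint rows of Table 4 are supplied.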


After 3 iterations the values of the minimum discrimination information estimates are as follows.

  ω     No.    x*(ω)     ln x*(ω)

 111     1    16.504     2.803630
 112     2     6.466     1.866627
 113     3     4.042     1.396620
 121     4     6.320     1.843785
 122     5     6.604     1.887611
 123     6     2.064     0.724458
 211     7    18.263     2.904873
 212     8    12.705     2.541984
 213     9     2.476     0.906704
 221    10     9.996     2.302228
 222    11     3.477     1.246192
 223    12     5.083     1.625813


It is found that 2I(x*:x) = 4.333, 3 D.F. We now proceed to test the hypothesis H2 that the two contingency tables are homogeneous. The B matrix, $\theta$, and $N\theta$ for H2 are given in Table 4.

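Using the B matrix of Table 4 we have

$$C = BW^{-1}, \qquad C = \begin{pmatrix} C_1\\ C_2 \end{pmatrix}, \qquad C_1 \text{ is } 2 \times 12, \quad C_2 \text{ is } 5 \times 12,$$

$$Cx = N\hat\theta = \bigl(94,\; 94,\; 17.646500,\; -15.924902,\; 7.575083,\; -4.648350,\; -0.086081\bigr)'.$$

$S_{22\cdot 1}$ is now a 5 x 5 matrix; we omit the detailed values, and $X^2 = d'S_{22\cdot 1}^{-1}d$ = 9.300, 5 D.F.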

After 3 iterations the values of the minimum discrimination information estimates are:

  ω     No.    x*(ω)     ln x*(ω)

 111     1    15.901     2.766356
 112     2     8.559     2.146948
 113     3     2.808     1.032321
 121     4     7.419     2.004112
 122     5     4.219     1.439503
 123     6     3.095     1.129803
 211     7    19.686     2.979930
 212     8    10.596     2.360522
 213     9     3.476     1.245895
 221    10     9.183     2.217686
 222    11     5.223     1.653077
 223    12     3.832     1.343370
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It is found that under H2 2I(x*:x) = 9.008, 5 D.F. If we denote the m.d.i. estimate under the marginal homogeneity hypothesis by $x^*_M$ and under the homogeneity hypothesis H2 by $x^*_H$, then we may summarize the results in the Analysis of Information Table 5.

                         Analysis of Information

Component due to                                 Information               D.F.

H2 (homogeneity)                                 2I(x*_H : x)   = 9.008      5
Homogeneity, given marginal homogeneity          2I(x*_H : x*_M) = 4.675     2
Marginal homogeneity                             2I(x*_M : x)   = 4.333      3

                                 Table 5


We see that the tables are homogeneous; hence the marginals are also homogeneous.

Note that

$$2\sum x^*_H \ln\frac{x^*_H}{x} = 2\sum x^*_H \ln\frac{x^*_H}{x^*_M} + 2\sum x^*_H \ln\frac{x^*_M}{x}.$$

But $x^*_H$ also satisfies the restraints for $x^*_M$ (homogeneity implies marginal homogeneity), hence

$$2\sum x^*_H \ln\frac{x^*_M}{x} = 2\sum x^*_M \ln\frac{x^*_M}{x}$$

and we have the analysis as in Table 5.

The statistics given by Gail (1974) are the same as the $X^2$ values given above.
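
The additive decomposition in Table 5 can be checked directly from the two tables of estimates. The sketch below is ours, not part of the original programs; it uses the estimates rounded to three decimals as printed, so agreement with the tabled information values is only to about two decimals.

```python
import math

# Observed counts and m.d.i. estimates, cell order 111, 112, ..., 223
x   = [20, 5, 5, 6, 4, 2, 15, 15, 2, 10, 5, 5]
x_M = [16.504, 6.466, 4.042, 6.320, 6.604, 2.064,
       18.263, 12.705, 2.476, 9.996, 3.477, 5.083]   # marginal homogeneity
x_H = [15.901, 8.559, 2.808, 7.419, 4.219, 3.095,
       19.686, 10.596, 3.476, 9.183, 5.223, 3.832]   # homogeneity (H2)

def two_I(a, b):
    """Return 2 * sum a ln(a/b), the discrimination information statistic."""
    return 2.0 * sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))

total     = two_I(x_H, x)     # component due to H2
effect    = two_I(x_H, x_M)   # homogeneity, given marginal homogeneity
marginal  = two_I(x_M, x)     # component due to marginal homogeneity

print(total, effect, marginal, effect + marginal)
```

Up to rounding of the printed estimates, the last two printed numbers should sum to the first, which is the content of the identity displayed above.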
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Example 2. Gokhale discrete distributions. This example illustrates the application of the k-sample procedure to test hypotheses about the means and variances of two discrete distributions, not in the form of contingency tables. An analysis of information table is given.


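$$W = \mathrm{diag}(w_1, w_1, w_1, w_1, w_1, w_2, w_2), \qquad w_1 = 60/180, \quad w_2 = 120/180,$$

$$C = BW^{-1} = \begin{pmatrix}
 3 &  3 & 3 & 3 & 3 & 0 & 0\\
 0 &  0 & 0 & 0 & 0 & 1.5 & 1.5\\
-6 & -3 & 0 & 3 & 6 & 2.25 & -2.25
\end{pmatrix},$$

 x(1) =  6     ln x(1) = 1.791759
 x(2) = 18     ln x(2) = 2.890371
 x(3) =  9     ln x(3) = 2.197225
 x(4) = 24     ln x(4) = 3.178054
 x(5) =  3     ln x(5) = 1.098612
 x(6) = 72     ln x(6) = 4.276666
 x(7) = 48     ln x(7) = 3.871201

$$S = CD_xC' = \begin{pmatrix} 540 & 0 & 0\\ 0 & 270 & 81\\ 0 & 81 & 1309.5 \end{pmatrix},$$

$$S_{22\cdot 1} = 1309.5 - (0 \;\; 81)\begin{pmatrix} 1/540 & 0\\ 0 & 1/270 \end{pmatrix}\begin{pmatrix} 0\\ 81 \end{pmatrix} = 1285.199951,$$

$$S_{22\cdot 1}^{-1} = 0.000778, \qquad d = -54,$$

$X^2$ = (-54)²(0.000778) = 2.269, 1 D.F.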
After two iterations there is obtained

 x*(1) =  7.618    ln x*(1) = 2.030505    -2 x  7.618 = -15.236
 x*(2) = 20.180    ln x*(2) = 3.004673    -1 x 20.180 = -20.180
 x*(3) =  8.909    ln x*(3) = 2.187082     0 x  8.909 =   0
 x*(4) = 20.978    ln x*(4) = 3.043467     1 x 20.978 =  20.978
 x*(5) =  2.315    ln x*(5) = 0.839582     2 x  2.315 =   4.630
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 x*(6) = 66.538    ln x*(6) = 4.197771    -1.5 x 66.538 = -99.807
 x*(7) = 53.462    ln x*(7) = 3.978971     1.5 x 53.462 =  80.193

2I(x*:x) = 2.248, 1 D.F.

(-15.236 - 20.180 + 0 + 20.978 + 4.630)/60 = -0.1635
(-99.807 + 80.193)/120 = -0.1635.

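Under H2 the restraints are $Bp = \theta$ with

$$B = \begin{pmatrix}
 1 &  1 & 1 & 1 & 1 & 0 & 0\\
 0 &  0 & 0 & 0 & 0 & 1 & 1\\
-2 & -1 & 0 & 1 & 2 & 1.5 & -1.5\\
 4 &  1 & 0 & 1 & 4 & -2.25 & -2.25
\end{pmatrix}.$$

Note that the last row of the matrix derives from

$$(-2)^2p_1(-2) + (-1)^2p_1(-1) + 0^2p_1(0) + 1^2p_1(1) + 2^2p_1(2) - \bigl((-1.5)^2p_2(-1.5) + (1.5)^2p_2(1.5)\bigr).$$

$$C = BW^{-1} = \begin{pmatrix}
 3 &  3 & 3 & 3 & 3 & 0 & 0\\
 0 &  0 & 0 & 0 & 0 & 1.5 & 1.5\\
-6 & -3 & 0 & 3 & 6 & 2.25 & -2.25\\
12 &  3 & 0 & 3 & 12 & -3.375 & -3.375
\end{pmatrix}, \qquad
N\theta = \begin{pmatrix} 180\\ 180\\ 0\\ 0 \end{pmatrix}, \qquad
Cx = N\hat\theta = \begin{pmatrix} 180\\ 180\\ 54\\ -171 \end{pmatrix},$$

$$S = CD_xC' = \begin{pmatrix}
540 & 0 & 0 & 702\\
0 & 270 & 81 & -607.5\\
0 & 81 & 1309.5 & -344.25\\
702 & -607.5 & -344.25 & 3040.875
\end{pmatrix},$$

$$S_{22\cdot 1} = \begin{pmatrix} 1309.5 & -344.25\\ -344.25 & 3040.875 \end{pmatrix}
- \begin{pmatrix} 0 & 81\\ 702 & -607.5 \end{pmatrix}
\begin{pmatrix} 540 & 0\\ 0 & 270 \end{pmatrix}^{-1}
\begin{pmatrix} 0 & 702\\ 81 & -607.5 \end{pmatrix}
= \begin{pmatrix} 1285.199951 & -162.0\\ -162.0 & 761.399902 \end{pmatrix},$$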
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$$S_{22\cdot 1}^{-1} = \begin{pmatrix} .000800 & .000170\\ .000170 & .001350 \end{pmatrix}, \qquad d = \begin{pmatrix} -54\\ 171 \end{pmatrix},$$

$$X^2 = (-54,\; 171)\begin{pmatrix} .000800 & .000170\\ .000170 & .001350 \end{pmatrix}\begin{pmatrix} -54\\ 171 \end{pmatrix} = 38.652, \quad 2 \text{ D.F.}$$

After four iterations there is obtained

 x*(1) = 18.134   ln x*(1) = 2.897783   -2(18.134) = -36.268,     4(18.134) = 72.536
 x*(2) = 13.081   ln x*(2) = 2.571174   -1(13.081) = -13.081,     1(13.081) = 13.081
 x*(3) =  4.000   ln x*(3) = 1.386189    0(4) = 0,                0(4) = 0
 x*(4) = 16.586   ln x*(4) = 2.808560    1(16.586) = 16.586,      1(16.586) = 16.586
 x*(5) =  8.199   ln x*(5) = 2.104045    2(8.199) = 16.398,       4(8.199) = 32.796
 x*(6) = 70.910   ln x*(6) = 4.261405   -1.5(70.910) = -106.365,  (-1.5)²(70.910) = 159.548
 x*(7) = 49.090   ln x*(7) = 3.893661    1.5(49.090) = 73.635,    (1.5)²(49.090) = 110.453

2I(x*:x) = 29.546, 2 D.F.

(-36.268 - 13.081 + 16.586 + 16.398)/60 = -0.2728,  (-106.365 + 73.635)/120 = -0.2728
(72.536 + 13.081 + 16.586 + 32.796)/60 = 2.2500,  (159.548 + 110.453)/120 = 2.2500


We may summarize in the analysis of information table.

                  Analysis of Information

Component due to        Information                  D.F.

H2                      2I(x*_2 : x)    = 29.546       2
H2-H1 (Effect)          2I(x*_2 : x*_1) = 27.298       1
H1                      2I(x*_1 : x)    =  2.248       1

We reject the hypothesis H2 but accept the hypothesis H1. The effect of the differences in the variances is significant.

We also used the Darroch-Ratcliff iterative scaling procedure for this example.
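
The two quadratic approximations of this example can be recomputed from the B matrix and the observed counts. The helper below is a sketch of ours (its name and signature are hypothetical, not those of the original programs); it forms C = BW^{-1}, S = C D_x C', and X² = d' S22.1^{-1} d, treating the first two rows of B as the rows that fix the sample sizes.

```python
import numpy as np

x = np.array([6, 18, 9, 24, 3, 72, 48], dtype=float)
vals1 = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # support of the first distribution
vals2 = np.array([-1.5, 1.5])                   # support of the second distribution
N = x.sum()
w = np.r_[np.full(5, x[:5].sum() / N), np.full(2, x[5:].sum() / N)]

B = np.array([
    np.r_[np.ones(5), np.zeros(2)],
    np.r_[np.zeros(5), np.ones(2)],
    np.r_[vals1, -vals2],          # equality of means
    np.r_[vals1**2, -vals2**2],    # equality of second moments
])

def quadratic_statistic(B, x, w, n_fixed=2):
    """Return (X2, d, S22.1) for the restraints in B; hypothetical helper."""
    C = B / w                                   # C = B W^{-1}
    S = C @ np.diag(x) @ C.T
    n_theta = np.r_[np.full(n_fixed, x.sum()), np.zeros(B.shape[0] - n_fixed)]
    d = (n_theta - C @ x)[n_fixed:]
    S11, S12 = S[:n_fixed, :n_fixed], S[:n_fixed, n_fixed:]
    S21, S22 = S[n_fixed:, :n_fixed], S[n_fixed:, n_fixed:]
    S221 = S22 - S21 @ np.linalg.solve(S11, S12)
    return float(d @ np.linalg.solve(S221, d)), d, S221

X2_H1, d_H1, S221_H1 = quadratic_statistic(B[:3], x, w)   # means only
X2_H2, d_H2, S221_H2 = quadratic_statistic(B, x, w)       # means and second moments

print(X2_H1, d_H1)
print(X2_H2, d_H2)
print(S221_H2)
```

Calling the helper with the first three rows of B gives the H1 quantities and with all four rows the H2 quantities, so the intermediate matrices can be compared with those displayed above.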
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Example 3. Marginal homogeneity of an r x r contingency table.

This example illustrates the application of the k-sample procedure to a set of data previously estimated using a different algorithm. It also serves as an introduction to the next example. It points out a case in which the π-distribution is not the uniform distribution and shows the estimate to retain properties of the original observations not involved in the null hypothesis. For applications of the notion of marginal homogeneity to higher order contingency tables see Kullback, 1971a, 1971b. The latter paper includes an example of the quadratic approximation to 2I(x*:x).
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Example 

Marginal  Homogeneity  of  an  r  x  r  Contingency  Table 


In the paper "Symmetry and marginal homogeneity of an r x r contingency table," by C.T. Ireland, H.H. Ku, S. Kullback, Journal of the American Statistical Association, Vol. 64 (1969), 1323-1341, the principle of minimum discrimination information estimation was applied to obtain RBAN estimates of the cell frequencies of an r x r contingency table under hypotheses of either symmetry or marginal homogeneity.

The procedures were illustrated with data from case-records of the eye-testing of employees in Royal Ordnance factories analysed by A. Stuart.

                                  Table

        7477 Women Aged 30-39; Unaided Distance Vision x(ij)

                                 Left Eye
Right Eye          Highest    Second     Third     Lowest     Total
                    Grade      Grade     Grade      Grade

Highest Grade        1520       266       124         66       1976
Second Grade          234      1512       432         78       2256
Third Grade           117       362      1772        205       2456
Lowest Grade           36        82       179        492        789

Total                1907      2222      2507        841       7477

We shall supplement the discussion in Ireland et al. (1969) by using the single-sample algorithm to derive the m.d.i. estimates as well as the minimum modified $\chi^2$ estimates and relate the results to values given by A. Stuart, "A test for homogeneity of the marginal distributions in a two-way classification," Biometrika, Vol. 42 (1955), 412-416 and V.P. Bhapkar, "A note on the equivalence of two criteria for hypotheses in categorical data," Journal
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of the American Statistical Association, Vol. 61 (1966), 228-235.

The reader is referred to Ireland et al. (1969) for further discussion and references. The basic table will also be used to illustrate the k-sample algorithm applied to incomplete data. We remind the reader that the graphic form of the log-linear representation using $C_1(\omega) = L$, $C_2(\omega) = T_1(\omega)$, $C_3(\omega) = T_2(\omega)$, $C_4(\omega) = T_3(\omega)$ presents

$$\ln\frac{x^*(\omega)}{x(\omega)} = L + \tau_1T_1(\omega) + \tau_2T_2(\omega) + \tau_3T_3(\omega)$$

where from the output L = 0.000805, $\tau_1$ = -0.159043, $\tau_2$ = -0.105379, $\tau_3$ = -0.050000. The T design matrix is of course the same as C'.

Bhapkar's test statistic is the minimum modified $\chi^2$ and he gave $X_B^2$ = 11.976 with 3 D.F. He did not give the minimum modified $\chi^2$ estimates. The program yields $X^2$ = 11.975717. Stuart gave no estimates either and he used as his statistic $X_S^2 = d'S_{22}^{-1}d$ = 11.957. Stuart estimated the covariance matrix of the d's under the null hypothesis. From the computer output we see that $S_{22}$ and $S_{22\cdot 1}$ are not very much different in this case.
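
The two quadratic statistics can be recomputed from the 4 x 4 table. The sketch below uses the usual textbook form of Stuart's covariance estimate (diagonal n(i.) + n(.i) - 2n(ii), off-diagonal -(n(ij) + n(ji))) and obtains the Bhapkar form by subtracting dd'/N; both are stated here as our assumptions about the forms behind $S_{22}$ and $S_{22\cdot 1}$, up to the scaling used by the program.

```python
import numpy as np

# Rows: right eye grade; columns: left eye grade
n = np.array([[1520,  266,  124,  66],
              [ 234, 1512,  432,  78],
              [ 117,  362, 1772, 205],
              [  36,   82,  179, 492]], dtype=float)
N = n.sum()
rows, cols = n.sum(axis=1), n.sum(axis=0)

# Differences of the first three row and column marginals
d = (rows - cols)[:3]

# Covariance estimate of d under the null hypothesis (Stuart's form, assumed)
V = -(n + n.T)
np.fill_diagonal(V, rows + cols - 2.0 * np.diag(n))
V = V[:3, :3]

stuart = float(d @ np.linalg.solve(V, d))
bhapkar = float(d @ np.linalg.solve(V - np.outer(d, d) / N, d))

print(d)
print(stuart, bhapkar)
```

The two covariance matrices differ only by the rank-one term dd'/N, which is small relative to V for these data.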

From the log-linear representation of the m.d.i. estimate we see that associations in the original table are the same as in the estimated table, thus

$$\ln\frac{x^*(ii)\,x^*(jj)}{x^*(ij)\,x^*(ji)} = \ln\frac{x(ii)\,x(jj)}{x(ij)\,x(ji)},$$

$$\ln\frac{x^*(ij)\,x^*(44)}{x^*(i4)\,x^*(4j)} = \ln\frac{x(ij)\,x(44)}{x(i4)\,x(4j)}.$$

Based on the values $X_S^2$ = 11.957, $X_B^2$ = 11.976 with 3 D.F. Stuart, and also Bhapkar, rejected the null hypothesis of marginal homogeneity. We find that 2I(x*:x) = 12.017, 3 D.F. and reject the null hypothesis of homogeneity.

We remark that the discussion in Ireland et al. (1969) used a different iterative algorithm.


Log-linear  representation 
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Example 4. Several samples, incomplete data.

This example uses the complete contingency table of the preceding example and row and column marginals only of additional samples. The example illustrates the application of the procedure to samples which may include fragmentary data.
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Example


Several Samples, Incomplete Data

We shall illustrate the k-sample algorithm of testing several samples with incomplete data in terms of a specific sample. In Table 1 the 7477 observations in the 4 x 4 contingency table are Stuart's data, which we have already examined under the null hypothesis of marginal homogeneity.

"The remaining 1100 observations are artificial data for 600 women for whom only left eye vision was reported and 500 women for whom only right eye vision was reported. It will be presumed that the incomplete data for women with vision classified only for one eye arose in a completely random manner which was statistically independent of the true classification of their vision with respect to both eyes. This assumption allows us to say that the marginal probabilities pertaining to left eye vision and right eye vision for women classified on both eyes are the same parameters as the probabilities pertaining to left eye vision for women classified only for the left eye and to right eye vision for women classified only for the right eye respectively" (Koch et al. 1972, p. 665, 666).
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The results for the k-sample algorithm computer output are summarized in Table 2, in which we also give the values derived by Koch et al. (1972) by their approach.

We also estimated this set of data using the Darroch-Ratcliff algorithm.

In view of the small values of the test statistics with 6 D.F. we accept the null hypothesis of the homogeneity of the data with respect to the underlying population.

Using the m.d.i. estimates of the entries in the cells of the complete contingency table as "improved" values over the original observations we repeat the test for the null hypothesis of marginal homogeneity. The resulting values are summarized in Table 2. There is no change in our inference that the data show no evidence of marginal homogeneity.

Table 3 gives the graphic presentation of the log-linear representation. The relationships may be checked using the appropriate values from the computer output.

Table 4 lists the input for the KULLITR2 computer program.
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                                    Table 1

              UNAIDED DISTANCE VISION; 8577 WOMEN AGED 30-39

                                    Left eye
                     Highest   Second    Third    Lowest    Sub-     Right
Right Eye             Grade     Grade    Grade     Grade    Total    Only     Total
                       (1)       (2)      (3)       (4)

Highest Grade (1)     1520       266      124        66     1976      140     2116
Second Grade  (2)      234      1512      432        78     2256      150     2406
Third Grade   (3)      117       362     1772       205     2456      160     2616
Lowest Grade  (4)       36        82      179       492      789       50      839

Sub Total             1907      2222     2507       841     7477      500     7977
Left Only              160       180      200        60      600        *        *

Total                 2067      2402     2707       901     8077        *     8577

See Koch, G.G., Imrey, P.B., and Reinfurt, D.W. (1972), Linear model analysis of categorical data with incomplete response vectors, Biometrics 28, 663-692, in particular p. 665.
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                                      Table 2

  ω     x(ω)      x*(ω)     min. mod.     x(ω) a)        b)         c) x**(ω)
                  m.d.i.    χ² est.

  1     1520    1530.227    1530.155    1529.495     1532.573     1531.372
  2      266     267.148     267.151     266.331      253.107      253.216
  3      124     124.403     124.405     123.670      110.966      111.552
  4       66      65.671      65.643      65.573       55.529       56.202
  5      234     234.664     234.676     235.301      247.726      247.955
  6     1512    1512.657    1512.810    1512.672     1515.085     1513.898
  7      432     431.729     431.773     430.600      408.454      408.742
  8       78      77.311      77.282      77.387       69.555       69.857
  9      117     117.190     117.195     117.838      130.215      130.894
 10      362     361.721     361.751     362.784      382.343      382.648
 11     1772    1768.752    1768.905    1769.357     1771.597     1770.209
 12      205     202.944     202.863     203.748      193.837      193.836
 13       36      36.006      36.007      36.114       41.662       42.121
 14       82      81.818      81.822      81.798       90.284       90.690
 15      179     178.413     178.422     177.878      186.975      187.084
 16      492     486.360     486.142     486.528      487.092      486.710
 17      140     132.904     132.898     132.745
 18      150     150.887     150.899     150.860
 19      160     163.876     163.884     164.085
 20       50      52.333      52.320      52.310
 21      160     153.919     153.915     153.966
 22      180     178.415     178.430     178.434
 23      200     200.880     200.897     200.736
 24       60      66.787      66.759      66.864

                2I(x*:x)    X² = 1.764  X² = 2.33    X² = 11.741  2I(x**: ...)
                = 1.771                                           = 11.730
                6 D.F.      6 D.F.      6 D.F.       3 D.F.       3 D.F.

a) See Koch et al. (1972) p. 669.
b), c) Using "improved" estimate to test marginal homogeneity; b) is min. mod. χ² and c) is m.d.i.
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                              Table 3

                     Log-linear representation

 i j    ω     L1    L2    L3     τ1    τ2    τ3    τ4    τ5    τ6

 1 1    1     v1                 v1                v1
 1 2    2     v1                 v1                      v1
 1 3    3     v1                 v1                            v1
 1 4    4     v1                 v1
 2 1    5     v1                       v1          v1
 2 2    6     v1                       v1                v1
 2 3    7     v1                       v1                      v1
 2 4    8     v1                       v1
 3 1    9     v1                             v1    v1
 3 2   10     v1                             v1          v1
 3 3   11     v1                             v1                v1
 3 4   12     v1                             v1
 4 1   13     v1                                   v1
 4 2   14     v1                                         v1
 4 3   15     v1                                               v1
 4 4   16     v1
 1 .   17           v2          -v2
 2 .   18           v2                -v2
 3 .   19           v2                      -v2
 4 .   20           v2
 . 1   21                 v3                      -v3
 . 2   22                 v3                            -v3
 . 3   23                 v3                                  -v3
 . 4   24                 v3

v1 = 1/w1 = 1.147118
v2 = 1/w2 = 17.153992
v3 = 1/w3 = 14.294999

$$\ln\frac{x^*(1)}{x(1)} = v_1L_1 + \tau_1v_1 + \tau_4v_1, \quad \text{etc.}$$

$$\ln\frac{x^*(17)}{x(17)} = v_2L_2 - \tau_1v_2, \quad \text{etc.}$$

$$\ln\frac{x^*(21)}{x(21)} = v_3L_3 - \tau_4v_3, \quad \text{etc.}$$

$$\ln\frac{x^*(1)}{x(1)} - \ln\frac{x^*(4)}{x(4)} = v_1\tau_4 = \ln\frac{x^*(5)}{x(5)} - \ln\frac{x^*(8)}{x(8)}$$

or

$$\ln\frac{x^*(1)\,x^*(8)}{x^*(4)\,x^*(5)} = \ln\frac{x(1)\,x(8)}{x(4)\,x(5)}, \quad \text{etc.}$$

Certain associations are retained.
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Table 4

Input for KULLITR2 Computer Program

TITLE = 'SEVERAL SAMPLES' TOL1 = .001 TOL2 = .001
INTERNAL = '0'B
NUMSET = 3 BMAT = '1'B CNSTRNT = 9 OBS = 24;
16 4 4
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0 0
0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 -1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 -1 0 0 0 0 0
1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 -1 0 0 0
0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 -1 0 0
0 0 1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 -1 0
1520 266 124 66 234 1512 432 78 117 362
1772 205 36 82 179 492 140 150 160 50
160 180 200 60
8577 8577 8577 0 0 0 0 0 0
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 1      0.000000     0.000000     0.000000     0.000000
 2      0.000000     0.000000     0.000000     0.000000
 3     14.294999    14.294999    14.294999    14.294999
 4      0.000000     0.000000     0.000000     0.000000
 5      0.000000     0.000000     0.000000     0.000000
 6      0.000000     0.000000     0.000000     0.000000
 7    -14.294999     0.000000     0.000000     0.000000
 8      0.000000   -14.294999     0.000000     0.000000
 9      0.000000     0.000000   -14.294999     0.000000

OBSERVED VALUES

X( 1) = 1520.000000    LN_X( 1) = 7.326466
X( 2) =  266.000000    LN_X( 2) = 5.583496
X( 3) =  124.000000    LN_X( 3) = 4.820282
X( 4) =   66.000000    LN_X( 4) = 4.189654
X( 5) =  234.000000    LN_X( 5) = 5.455321
X( 6) = 1512.000000    LN_X( 6) = 7.321189
X( 7) =  432.000000    LN_X( 7) = 6.068426
X( 8) =   78.000000    LN_X( 8) = 4.356709
X( 9) =  117.000000    LN_X( 9) = 4.762174
X(10) =  362.000000
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f>  .291644 
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2  C 3 .000000 
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X  (  1  ft  )  - 
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LN..X  ( 1  6  )  = 
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4  .  S4  1  o-f  3 
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A  (  10)  — 

16u.  JuOOO'C 

L  N_  A (  19)  = 
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In_a  (2  D  = 

3.0  7  3173 

X  (  22  )  = 

Ifcu.^GOOuO 

IN_ a ( 22  )  = 

5.  19295  7 

X  (  2  j  )  = 

2.  C  C  .  0  0  0  J  o  0 

LN_X( 23)= 

3 . 2  9  o  3  1  7 

X(  24)  = 

6  C  .  C  C  JuUO 

L  is_  X  (  2  4  )  — 

4 . 0  3  4  34  4 

CONST  FA  IMS 
N  Thb  T  7 (  1  )  = 
NTHbTA  ( 2  )  = 
SITbET*  (  3)  = 
NTHb  T  7  (4  )  = 
NIHFT7 l 5  )  = 
NTHET /  l  t  J  = 
NT hb  T 7 (  7  )  = 
NTHLT7 ( 8  )  = 
NTI-ET7  (9)  = 


b3  77 .oOOGOO 
tJ 77 . uuOOOO 
tI5  77 •  OCOOOJ 
O.COOJJO 

0 .OUUUUU 
0 . C  C  0  wu  c 
0.000000 
0 .uOO JO0 
0 . oOu J JO 


bSl  Itf/1  E  c:r  MhcTA  AT 


N  THAT  1 1  )  = 
NTH/ST  (2  )  = 
NTtiAT  (3  )  = 
N  T  (■*  A  T  ( 4  )  = 
NT  HAT  ( 3 )  = 
N  1  HAT  (  0  )  = 
NTnAT  (  7  »  = 
NT  HAT  l  b  )  = 


b  3 7o . 99  CUSh 
i 3  7t.9921 Jo 
b  3  7  6.99oG94 
-134.634431 
14. 740504 
7  2  .  Of>2  19  J 
-SO  .  e4  e  3o 1 
-24.20440b 


CULM- 
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Reproduced  from 
best  available  copy. 
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DM  17(1)- 

•0,U0350o 

1)1  L  17  (  2  )  - 

J  .  C  J  7  o  1  ‘ 

'JH  17(3)- 

j  .  0  0  3  4  0  o 

DELTA  (  4  )  = 

134.054431 

UlLTA ( 5  ) - 

-  1  .  79  c  3  b  4 

DELTA  lo  )  = 

-  72 .682190 

OM.TA  (  7  )  = 

4  9  .  o  4  c  5  o  1 

JcLTA ( 6  )  = 

2  4 .204496 

DELTA (s  )  = 

-io.ti24U3o 

Xb4  = 

1  . 

76335  3 

l  b  T  I  R  A  T 

E  IE  X  Al  CULM  T  = 

1 

X  S  T  A  R  { 

1 )  = 

i  3  30 . 1  7  1  to  j  1 

L  X  S  1  AH  ( 

n  = 

7 

XSTAR  ( 

2)  = 

267.150674 

L  J_XST  AH  ( 

2)  = 

5 

X  S  1 7  k  ( 

2)  = 

124 .404  73  4 

LiM_X S T  AR  ( 

3)  = 
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A  S 1  A  R  ( 
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o5  .o430>32 

L  w_  X  S  T  A  k  ( 
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XS1AP  ( 
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234.674)  44 

L  iM_  AST  AR  ( 

5)  = 
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L  )  = 
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Li4_XST  AK  ( 
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XS17R  ( 

7)  = 

431 . 765237 

LM.ASTAR  l 

/)  = 

6 

XSIAR  ( 

6  )  = 

77.264515 

LM_XSTAk  ( 

6)  = 

4 

aSTAR  ( 

9  )  = 

117.193665 

LN_XSTAk  ( 

5  )  = 

4 

XS1AF.  (  10)  = 

36I .797555 

L\_AS 1  AH  ( iu)  = 

5 

XSIAP.  (11)  = 

l  766 .b091tC 

lh_xstak(  id  = 

7 

XSTAR ( 12) = 
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LN_  X  S  T  AH (12)  = 

3 

XST7R  (  13)  = 

36 . COc  3G2 
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ASIA!  ( 14  )  = 
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l.U_XSTAR  (  14)  = 
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LN_XST AH (  15)  = 
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LN_X  ST  AR (  16/  = 
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X  S  T  7  R  (15)  = 
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Lrj_XiTAIs  (  L5)  = 

3 

aSIAR (20)- 

52 .3430<i2 

L.VXSTAR  (2o)  = 

J 

XS1AR  (< 

1)  = 

153.095746 

L  ^_XS  T  AR  (  2 1)  = 

3 

ASTAR (22)= 

17d  .262632 

l  \l_  Xb  1  AR.  (  22  )  = 

3 

aSTAR  (4 

3)  = 

200 .7252j5 

L',_AST  AH.  ( 

23)  = 

3 

ASTAR (24) = 

67 ,096ol9 

L 1  -1  _  X  S  7  Ah  (  24)  = 

4 

^1  (Xb7AR:^)=  1.63/3  >3  7 


T/Ul  1  )  = 

r/u.(2 )  = 

T AU(  3  )  = 
r  a  u  (  m  i  = 
T  A'J  (  5  )  = 
T  A  U  (  C  )  = 


l  .  JO joo2 
0.002355 
0  .  JCU90 
0.010591 
C . )0o991 
C.JG7507 


j  j  J  1 3  l 
56/613 
3? 3391 
164^32 
9  5  6  1  4  6 
3  2  1  7  l  4 
067651 
39  7993 
7i_362  7 
0  9  0  5  4  6 
476107 
3U373 

3  d  3  C  4  4 

4  G  4  3-t  1 

1  0  1  4  5 

10o562 
6  SO  3-t  4 
UiLUo4 
056603 

5  5  7  o  5  7 
C36t  7  6 
163372 
3  C  i  5  3  7 

2  C  c  1 3  4 
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tSTIPATt  O 

X  AT  C  LU\ T  = 

1  3 

XS  TAP  ( 

1)  = 

U30  .ci  70S  1 

LN_XST Ak ( 

1)  = 

7.iiii72 

X  S  TAP  ( 

2)  = 

2  o  7  . 1  A  7  7  J  8 

L*N4_X  ST  Ar  { 

2 )  = 

3 . 3  ti  7o  02 

X  S  TAP  ( 

3  )  = 

124.4J307o 

L  '4_XST  AK  ( 

3  )  = 

*+  .o2ii2  7 

XSTA-v  1 

A)  = 

o  3  .  t  7  C  o  3  4 

L.4_>.  ST  Al<  { 

4 )  = 

4.1o4631 

XST7.P  ( 

5)  = 

4  3 1 . 6  c  4  1 4  4 

L  «M _ A  S  T  A  f<  ( 

8  i  = 

j  .  *+36  13o 

XSTAP  ( 

o)  = 

1 1 1 8  .68  7 '+71 

LN_X ST Ai:  l 

u  )  = 

7 .32ic24 

XSTAP  ( 

7 )  = 

Ail. 729  Jo4 

Li\_X  ST  AT  ( 

7)  = 

0.067749 
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Example  5.  Specified  log-linear  representation. 

In  this  example  the  problem  specifies  the  form  of  the 
log-linear  representation  and  consequently  the  design  matrix. 
The  general  linear  hypothesis  approach  is  necessary. 
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Example 

Specified  Log-linear  Representation 

D.V. Gokhale, "Analysis of log-linear models," Jour. Royal Statistical Soc. Series B, Vol. 34 (1972), pp. 371-376, formulates a problem for a 2 x 2 x 3 three-way contingency table of fitting a model such that the log-linear representation is of the form

$$\ln x^*(ijk) = L + (i-1)\tau^{i} + (j-1)\tau^{j} + (k-1)\tau^{k} + (i-1)(j-1)\tau^{ij} + (i-1)(k-1)\tau^{ik} + (j-1)(k-1)\tau^{jk} + (i-1)(j-1)(k-1)\tau^{ijk}.$$

This implies that the graphic version of the log-linear representation is as given in Fig. 1.


Figure  1 
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The  observed  values  (fictitious)  are 


  i  j  k   x(ijk)      ijk   x(ijk)

  1  1  1     58        211     75
  1  1  2     49        212     58
  1  1  3     33        213     45
  1  2  1     11        221     19
  1  2  2     14        222     17
  1  2  3     18        223     22

Gokhale used an iterative procedure that might be described as a "steepest descent" procedure. We shall set this up using the k-sample algorithm (of course here k=1) and using the uniform distribution as the initial distribution. In this case the C matrix is the transpose of the T matrix in Fig. 1 and is given again for convenience in Fig. 2.

  i    1  1  1  1  1  1  2  2  2  2  2  2
  j    1  1  1  2  2  2  1  1  1  2  2  2
  k    1  2  3  1  2  3  1  2  3  1  2  3
  ω    1  2  3  4  5  6  7  8  9 10 11 12

       1  1  1  1  1  1  1  1  1  1  1  1
       0  0  0  0  0  0  1  1  1  1  1  1
       0  0  0  1  1  1  0  0  0  1  1  1
       0  1  2  0  1  2  0  1  2  0  1  2
       0  0  0  0  0  0  0  0  0  1  1  1
       0  0  0  0  0  0  0  1  2  0  1  2
       0  0  0  0  1  2  0  0  0  0  1  2
       0  0  0  0  0  0  0  0  0  0  1  2

Figure 2

The estimated values as given by Gokhale are

  i  j  k   x*(ijk)     ijk   x*(ijk)

  1  1  1    59.73      211    74.97
  1  1  2    45.54      212    58.06
  1  1  3    34.73      213    44.97
  1  2  1    10.98      221    17.85
  1  2  2    14.05      222    19.29
  1  2  3    17.98      223    20.85

The goodness-of-fit X² statistic is 0.8083, 4 D.F.
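As a cross-check on the estimates above, the fit can be reproduced with a short Python sketch. It is not Gokhale's "steepest descent" procedure and not the algorithm of KULLITR2 or DARRAT: it assumes that the m.d.i. estimate with a uniform initial distribution coincides with the maximum likelihood fit of the log-linear model, and obtains that fit by Newton-Raphson in its iteratively reweighted least squares form. The function names `solve` and `fit_loglinear` are hypothetical.

```python
import math

# Design matrix of Fig. 2 (rows: parameters, columns: cells) and observed table.
C = [
    [1] * 12,
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1],
    [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 1, 2, 0, 1, 2],
    [0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 1, 2],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2],
]
x = [58, 49, 33, 11, 14, 18, 75, 58, 45, 19, 17, 22]


def solve(a, b):
    """Solve the linear system a s = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    s = [0.0] * n
    for r in reversed(range(n)):
        s[r] = (m[r][n] - sum(m[r][c] * s[c] for c in range(r + 1, n))) / m[r][r]
    return s


def fit_loglinear(C, x, iterations=25):
    """Fit ln x* = C' tau so that C x* = C x (Newton-Raphson / IRLS)."""
    p, n = len(C), len(x)
    mu = [float(v) for v in x]          # start from the observed table
    eta = [math.log(v) for v in mu]
    tau = [0.0] * p
    for _ in range(iterations):
        # Working response and weighted normal equations.
        z = [eta[w] + (x[w] - mu[w]) / mu[w] for w in range(n)]
        a = [[sum(C[r][w] * mu[w] * C[s][w] for w in range(n)) for s in range(p)]
             for r in range(p)]
        b = [sum(C[r][w] * mu[w] * z[w] for w in range(n)) for r in range(p)]
        tau = solve(a, b)
        eta = [sum(C[r][w] * tau[r] for r in range(p)) for w in range(n)]
        mu = [math.exp(e) for e in eta]
    return tau, mu


tau, xstar = fit_loglinear(C, x)
# Minimum discrimination information statistic 2I(x:x*), 12 - 8 = 4 D.F.
two_i = 2 * sum(v * math.log(v / m) for v, m in zip(x, xstar))
print([round(m, 2) for m in xstar])
print(round(two_i, 4))
```

The fitted values can be compared with the table of Gokhale's estimates, and the statistic 2I(x:x*) with the goodness-of-fit value quoted above; the two statistics are of the same order but need not agree exactly, since one is a quadratic approximation to the other.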


The  input  values  for  the  KULLITR2  computer  program  are 
given  in  table  1. 

The  input  values  for  the  DARRAT  computer  program  are 
given  in  table  2. 


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TITLE= ' GOKHALE  ANALYSIS' 

OBS=  12 
CNSTRNT=  8 
FACTORS=  3 

TOL  1=  .001  TOL  2=  .001  ; 

FACNAME ( 1) =  'I' 

FACNAME ( 2 ) =  'J' 

FACNAME ( 3)  =  'K' 

2  2  3 

111111111111 

000000111111 

000111000111 

012012012012 

000000000111 

000000012012 

000012000012 

000000000012 

58  49  33  11  14  18  75  58  45 

19  17  22 

Table  1 

Input  to  KULLITR2  Computer  Program 
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TITLE= * GOKHALE"S  ANALYSIS ' 
BLOCKS= 8

TOL1= .001  TOL2= .001  CNSTRNT= 8

OBS= 12  FACTORS= 3 ;

FACNAME(1)= 'I'  FACNAME(2)= 'J'  FACNAME(3)= 'K' ;

2  2  3

11111111
111111111111
000000111111
000111000111
012012012012
000000000111
000000012012
000012000012
000000000012

58  49  33  11  14  18  75  58  45  19  17  22


Table  2 . 

Input  to  DARRAT  Computer  Program 
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oGKt  ALL'  S  minIALVSI  S 
3 FACTOR TABLE: I*J*K

C DESIGN MATRIX

       1  2  3  4  5  6  7  8  9 10 11 12
  1    1  1  1  1  1  1  1  1  1  1  1  1
  2    0  0  0  0  0  0  1  1  1  1  1  1
  3    0  0  0  1  1  1  0  0  0  1  1  1
  4    0  1  2  0  1  2  0  1  2  0  1  2
  5    0  0  0  0  0  0  0  0  0  1  1  1
  6    0  0  0  0  0  0  0  1  2  0  1  2
  7    0  0  0  0  1  2  0  0  0  0  1  2
  8    0  0  0  0  0  0  0  0  0  0  1  2

OBSERVED VALUES

 1 1 1   X( 1)=  58.000000   LN_X( 1)=  4.060443
 1 1 2   X( 2)=  49.000000   LN_X( 2)=  3.891820
 1 1 3   X( 3)=  33.000000   LN_X( 3)=  3.496508
 1 2 1   X( 4)=  11.000000   LN_X( 4)=  2.397895
 1 2 2   X( 5)=  14.000000   LN_X( 5)=  2.639057
 1 2 3   X( 6)=  18.000000   LN_X( 6)=  2.890371
 2 1 1   X( 7)=  75.000000   LN_X( 7)=  4.317488
 2 1 2   X( 8)=  58.000000   LN_X( 8)=  4.060443
 2 1 3   X( 9)=  45.000000   LN_X( 9)=  3.806663
 2 2 1   X(10)=  19.000000   LN_X(10)=  2.944439
 2 2 2   X(11)=  17.000000   LN_X(11)=  2.833213
 2 2 3   X(12)=  22.000000   LN_X(12)=  3.091043

CONSTRAINTS
NTHETA(1)=  419.000000
NTHETA(2)=  236.000000
NTHETA(3)=  101.000000
NTHETA(4)=  374.000000
NTHETA(5)=   58.000000
NTHETA(6)=  209.000000
NTHETA(7)=  111.000000
NTHETA(8)=   61.000000


ESTIMATE OF NTHETA AT COUNT=
NTHAT(1)=  418.999756
NTHAT(2)=  209.499939
NTHAT(3)=  209.499939
NTHAT(4)=  418.999756
NTHAT(5)=  104.749969
NTHAT(6)=  209.499939
NTHAT(7)=  209.499939
NTHAT(8)=  104.749969
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12  1  AST  AR.  (  h  J=  14.612383 

1  2  2  XST  AR  (  ^i=  lo .  18  3000 

12  3  XSTARl  o)  =  i7.B:>6044 

2  l  1  aSIAFI  /)=  90.060471 

2  1  2  XSIAM  (j)=  Jd.  0 00 948 

2  1  3  A  SI  Ah'  (  ';)  =  3'3.1h1139 

2  2  1  XSTARl  l  11  =  l/.U8o079 

2  2  2  XST  AR  l  1  i  1  =  10.O:>9093 

2  2  3  XS1A^(12)=  19.488113 


X  3  T  A  k  ( 

4  1  = 

2  .  O  l  060 

XS  1  r>  ( 

3) 

.  7  8  2  l  J  j 

Ail  A  h  ( 

o  1  - 

X  ‘  i  f  «  R  ( 

71  - 

't  .  /I  Vt 

X  S  I  <>  !■  1 

(i  1  - 

't  .  J  7  j  •  5  t 

X  j  (  Ak  i 

'i  1 

j  .  04 1 2 '  t 

XjUkII 

-  /  - 

2  .  a  o  2  -j  4  i 

X  S  T  AR  u  i  1  = 

/ . 92  )  J 

A  S  1  A  K  (  [ 

2  1  = 

/  .  9  0  8  2  0  •♦ 

1  IS  UiSSfcRVtD  TAbLE  A NO  X  IS  INITIAL  GIST. 

21  IXSTAR : X J  =  184.  OjaOoO 

2  I ( 2 i XS  T  AR  ) =  7.89 58  74 

T  All  1 1  1  =  0.434  372 
TAl'I  2  1  =-  l  .  364242 
TAU<  3 1990 
TAUl4)=-0. 233695 
TAUIp 1  =- U  •  U  / 1004 
T  A  U  (  o  1  =  0. ‘♦6  8228 
T  AU(  7  )  -  0  .0  1  ‘♦.>25 


ESTIMATE  GE  X  AT  CUUNT=  0 

1  1  1  XST AP l  11  =  59.  1 21 3b b 

1  1  2  XS  T  AR (  21=  4S.S43640 

1  1  3  XST  AR I  3 )  =  3 4. 7c8190 

1  *  1  XSTAKI  4)  =  1  9. 9  76489 

1  2  2  XSTARl  o)=  14.047031 

1  2  3  XSTARl  61=  17.976317 

2  1  l  XS1ARI  71=  74.960704 

2  l  2  XSTARl  «)=  56.0o2469 

213  XSTARl  91=  44.9687o5 

2  2  1  XSTARl  10)=  17.832737 

2  2  2  XST  AR (  1 1 1 =  19.294340 

2  2  3  XST  AR (  12 ) =  20.0527O8 


LN_XST  AR (  11  = 
LN_  X  S  T  AR I  2 )  = 
LN_X  S  I  AR  I  31  = 
L N_X S T  AR  I  41  = 
L  N_ a  STAR!  j i  - 
LN_  XS I AR (  o)  = 
LN_XSTAk(  71= 
L\_XSTAR(  o  1  = 
LN_  X  S  T  AR.  (  91  = 
L  N_  X  ST  AR  1101  = 
LN_XST  Aa ( 1 U  = 
LN..XST  Ak  (  id  = 


4 .069790 
3.818671 
3.34  7332 
2.39373b 
2 . 2  4  11 
2 . ooOOOO 
4.317071 
•i.Ool  320 
j. 803908 
2 . 8  82 1 30 
2.989822 
3 . 0  3  74  tio 


Z  IS  GhSthVU)  TABLE  A.  10  X  IS  INITIAL  GIST. 
21  ( XS  TAR : X 1  =  141.138319 
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3  t-ACTOk  TA8Lf:I*J*K 


DARRAT 


C DESIGN MATRIX

       1  2  3  4  5  6  7  8  9 10 11 12
  1    1  1  1  1  1  1  1  1  1  1  1  1
  2    0  0  0  0  0  0  1  1  1  1  1  1
  3    0  0  0  1  1  1  0  0  0  1  1  1
  4    0  1  2  0  1  2  0  1  2  0  1  2
  5    0  0  0  0  0  0  0  0  0  1  1  1
  6    0  0  0  0  0  0  0  1  2  0  1  2
  7    0  0  0  0  1  2  0  0  0  0  1  2
  8    0  0  0  0  0  0  0  0  0  0  1  2

OBSERVED VALUES

 1 1 1   X( 1)=  58.000000   LN_X( 1)=  4.060443
 1 1 2   X( 2)=  49.000000   LN_X( 2)=  3.891820
 1 1 3   X( 3)=  33.000000   LN_X( 3)=  3.496508
 1 2 1   X( 4)=  11.000000   LN_X( 4)=  2.397895
 1 2 2   X( 5)=  14.000000   LN_X( 5)=  2.639057
 1 2 3   X( 6)=  18.000000   LN_X( 6)=  2.890371
 2 1 1   X( 7)=  75.000000   LN_X( 7)=  4.317488
 2 1 2   X( 8)=  58.000000   LN_X( 8)=  4.060443
 2 1 3   X( 9)=  45.000000   LN_X( 9)=  3.806663
 2 2 1   X(10)=  19.000000   LN_X(10)=  2.944439
 2 2 2   X(11)=  17.000000   LN_X(11)=  2.833213
 2 2 3   X(12)=  22.000000   LN_X(12)=  3.091043

CONSTRAINTS

NTHETA(1)=  419.000000
NTHETA(2)=  236.000000
NTHETA(3)=  101.000000
NTHETA(4)=  374.000000
NTHETA(5)=   58.000000
NTHETA(6)=  209.000000
NTHETA(7)=  111.000000
NTHETA(8)=   61.000000


THE INITIAL DISTRIBUTION IS UNIFORM

ITERATIONS= 63


ESTIMATED DISTRIBUTION


1 

l 

1 

XSTAk ( 

i  )  = 

55 . 653397 

L  N...  X  5  T  A  R  l 

1)  = 

4.608531 

l 

1 

c 

aST  Ak  I 

2  )  = 

•t 9 . 5  4429c 

L  N_  X  S  T  A  r  ( 

2  )  = 

_>  .  6  i  Ocou 

1 

1 

3 

XSTAk ( 

3)  = 

34. 7  7224  / 

L  N_XST  Ak i 

5)  = 

3 . 54  882  ) 

1 

c 

1 

XSTAk ( 

4  )  = 

11 . 006032 

LN..  XS  TAR  I 

4  )  = 

2.  jvtiuc  t 

1 

2 

2 

XSTAk  ( 

5)  = 

14. 0  :>  6  2  4  9 

L  N_  X  S  T  A  k  l 

5  f  = 

2  •  u  -+  3  2  6  6 

1 

2 

3 

XSTAk ( 

6 )  - 

17.963644 

LN_XSTAr;  l 

-o)  = 

<_  .  6  c  7  7  5 

2 

1 

1 

a  s  r  a  r  ( 

7)  = 

74. 360 144 

LN_XST  AR ( 

7)  = 

4  ,  i  7  3  39 
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2 

1 

2 

AbTAkl  f.  1 

JX 

•Jd.O<.  i]  b  / 

LN  X  S T  A i< 

V  M) 

<i 

.  J., 

1  ••  j. 

2 

1 

3 

a  s  r  a  p.  i  s  i  - 

-i  *•  •  4  3  82  <J  U 

Lk.  _  X  8  T  A  in 

l  J) 

.1 

.  o'  i 

’  ■  7  j  ) 

2 

l 

ASTAftllUlr 

1 7. o  3  U  7  0  d 

L  X  S  T  A 

(  2  0 I 

L. 

•  NJO 

.■  '^3 

» 

<- 

2 

2 

aSTA„ ( ll)= 

19.294340 

Li\_aSTAk  (ii)^ 

2 

.43 

7  l »  X.  2 

2 

2 

3 

/STAR (121= 

20.  cl  3  3  1  u  9 

L  l\ _ AST  Ah* 

U2)  = 

3 

.03 

1  (  -Ji 

fc  5  T  1  Ri  A  T  1  . 0 

UU\ 51 kA 1  NT  S 

N  T  li  A  T  4  1)- 

418.  < L.  4  7  3  6 

NT  HAT  (21  = 

25  o  .  >093 49 

Ml HaT <  _>)  = 

101.020323 

ISIl  HAT  l*» )  = 

.»  /•*  .O  Jl  3  74 

NT  HAT  <  b  J  = 

5  d  .  )  C  C  4  1 2 

NT  HAT  (o)  = 

2  08  .4  6-i  4o7 

Dirt  AT (  /  )  = 

110.9/0398 

NTHaT  1 8  I  = 

6  1.  J04868 

OUTLIER  STATISTIC 
L  IS  OBSERVED  TABIC  AMD  X 
1  1  1  uUTLIEkAt  li 

1  1  2  UUTLI  ERX  (  2) 

L  l  3  LUTLIEKXI  3) 

1  2  1  UUlLlCRXI  4) 

1  2  2  UUTLIEKaI  5) 

123  QUTLIERXI  o) 

c  1  I  OoTLIEkXl  7) 

2  12  OUTL l  Ek  X (  8) 

‘213  OUT  L 1 ERX (  9  ) 

2  2  1  OUT  L  1  Ek  A  (  1  0  I 

222  OUTLlERX(ll) 

2  2  3  OUT  L  I  Ek  X  (  1 2  ) 


IS  iNl  VIAL  UiSTkldUTIf.M 
=  13.24  do 81 
=  '2.823442 
=  0. 00059 8 

=  27.50664d 
=  1 8 . y  7  6028 
=  1 1 . 2d  32  84 
=  30.629203  _ 

=  ~i  1.771381 
=  2 . 5 j  8200 

*'"11.449866 
=  9.266115 

=  7.2^6758 


1  1  1 

1  i  2 

l  l  3 

1  2  1 

l  2  2 

1  2  3 

*2  1  1 

2  l  2 

2  13 

2  2  1 

2  2  2 

2  .2  3, 


UUTl.  1  ERZ  (  1  )  =  0 . 04o474 

OUT 1 1 E  R  Z (  21=  0.252732 
UUTL I ERZ l  3 1 =  0.042710 
CJUTLIERZl  4  I  =  O.UOOUUo 
UUTL 1 EkZ (  5 {=  J. 000242 
UUTL I ERZ I  61=  0.000119 
UUTL  IEP  2  (  '71=  0.000  JO  z' 
UUTLIERZI  d )  =  _0 • JC0069 
UUTL IcRZl  9 )  =  0.0UO039 
GUTLIEkZIIOI =  U. 071710 
OUTL I fc  R  Z ( 1 1  I =  0.290512 
(JUT  L  I E  kX  ( 1_2  )  =  __0  .  Uo  1 18  1 


2I(XSTAR:X)=  141.023143
2I(Z:XSTAR)=    0.813237

TO  L  l  =  O.UlOuOO  T012=  U.OIOOOO 
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Example  6.  Four  point  bioassay  -  fit  of  logistic  function. 

This  example  illustrates  the  application  of  the  k-sample 
procedure  to  fitting  data  based  on  restraints  using  the  observed 
values.  The  procedure  was  also  used  on  the  data  of  examples  1 
and  2  of  chapter  4,  with  results  the  same  as  there  given.  It 
has  also  been  applied  in  a  number  of  other  cases,  not  given  here 
as additional examples.

We reformulate the data first as the 4 x 2 contingency table 2, with entries x(ij), i = 1,...,4, j = 1,2.


   i     j = 1     j = 2
         Deaths    Alive

   1        1         9       10
   2        6         4       10
   3        3         7       10
   4        8         2       10

           18        22       40

Table 2


The  log-linear  diagram  of  the  representation  of  the  minimum 


discrimination  information  estimate  is  shown  in  Fig.  1. 




For the procedure fitting observed marginals or other restraints, we note that Fig. 1 implies the following relations:

x*(i1) + x*(i2) = x(i1) + x(i2),  i = 1,2,3,4,
x*(11) + x*(21) + x*(31) + x*(41) = x(11) + x(21) + x(31) + x(41),
x*(21) + 2x*(31) + 3x*(41) = x(21) + 2x(31) + 3x(41).


For  the  k-sample  algorithm  this  is  a  case  of  4  samples,  two 
observations  per  sample.  The  basic  B  matrix  is  given  in  Fig.  2. 


  ij   11  12  21  22  31  32  41  42
  ω     1   2   3   4   5   6   7   8

        1   1   0   0   0   0   0   0
        0   0   1   1   0   0   0   0
        0   0   0   0   1   1   0   0
        0   0   0   0   0   0   1   1
        1   0   1   0   1   0   1   0
        0   0   1   0   2   0   3   0

Figure 2
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In  view  of  the  relations  given  above  between  values  of  the 
x*'s  and  the  x's,  in  this  case  the  C  matrix  is  derived  from  the 
B  matrix  by  the  relations 


where B is 6 x 8, B_1 is 4 x 8, B_2 is 2 x 8, with similar dimensions for the C matrix and its components.

We remark that instead of starting the iteration from the uniform distribution, the initial distribution used in the computer output attached was x_1*(ij) = x(i.)x(.j)/N as calculated from table 2. We comment that another run using the uniform distribution Nπ(ij) = 5 as the initial distribution for the iteration yielded the same final values. The computer input data is given in table 5.

By computing the maximum likelihood estimates of α and β in his formulation, Berkson derived the estimates given in table 3.

Berkson's Estimate (Max. Likelihood)

  Deaths       Alive

  1.901431     8.098569    10.000000
  3.445099     6.554901    10.000000
  5.405505     4.594495    10.000000
  7.247965     2.752035    10.000000

 18.000000    22.000000    40.000000

Table 3
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TS 


Berkson gave a value 2I(x:x*) = 5.985432, 2 D.F. (on p. 447 of Berkson (1972) the degrees of freedom are incorrectly given as 1).


The  m.d.i.  estimates  after  4  iterations  are  given  in  table  4. 


M.D.I. Estimate, 4 iterations

  Deaths       Alive

  1.901434     8.098566    10.000000
  3.445101     6.554895     9.999996
  5.405508     4.594491     9.999999
  7.247968     2.752036    10.000004

 18.000011    21.999988    39.999999

Table 4

2I(x:x*) = 5.985401, 2 D.F.
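The agreement between Berkson's maximum likelihood estimates and the m.d.i. estimates can be checked with a small Python sketch. It is not the k-sample algorithm of the program; it assumes the logistic formulation in which the log-odds of death is linear in the sample index, ln(x*(i1)/x*(i2)) = a + b(i-1), and solves the two likelihood equations by Newton-Raphson. The names `fit_logistic`, `a`, and `b` are hypothetical.

```python
import math

dose = [0, 1, 2, 3]      # i - 1 for the four samples
n = [10, 10, 10, 10]     # observations per sample, x(i.)
deaths = [1, 6, 3, 8]    # x(i1)


def fit_logistic(dose, n, deaths, iterations=25):
    """Newton-Raphson for the two-parameter logistic likelihood equations."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(a + b * d))) for d in dose]
        w = [m * q * (1.0 - q) for m, q in zip(n, p)]
        # Score: observed minus fitted for the two linear restraints.
        ua = sum(r - m * q for r, m, q in zip(deaths, n, p))
        ub = sum(d * (r - m * q) for d, r, m, q in zip(dose, deaths, n, p))
        # Information matrix elements.
        iaa = sum(w)
        iab = sum(d * v for d, v in zip(dose, w))
        ibb = sum(d * d * v for d, v in zip(dose, w))
        det = iaa * ibb - iab * iab
        a += (ibb * ua - iab * ub) / det
        b += (iaa * ub - iab * ua) / det
    return a, b


a, b = fit_logistic(dose, n, deaths)
p = [1.0 / (1.0 + math.exp(-(a + b * d))) for d in dose]
xstar = [(m * q, m * (1.0 - q)) for m, q in zip(n, p)]
observed = [(r, m - r) for r, m in zip(deaths, n)]

# 2I(x:x*) summed over all eight cells.
two_i = 2 * sum(o * math.log(o / e)
                for obs_row, est_row in zip(observed, xstar)
                for o, e in zip(obs_row, est_row))

for dead, alive in xstar:
    print(round(dead, 6), round(alive, 6))
print(round(two_i, 6))
```

The fitted cells satisfy the restraints listed earlier (row totals, total deaths, and the weighted sum x(21) + 2x(31) + 3x(41)), which is why the two routes lead to the same table.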

We  also  have  the  analysis  of  information 


Analysis of Information

Component due to                          Information             D.F.

x(i.), x(.j)                              2I(x:x_1*)  = 12.863     3

x(.j), x(21)+2x(31)+3x(41), x(i.)         2I(x*:x_1*) =  6.878     1

                                          2I(x:x*)    =  5.985     2
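The additivity of the analysis of information can be verified numerically. The following Python sketch takes the observed table, the initial distribution x_1*(ij) = x(i.)x(.j)/N, and the m.d.i. estimates of table 4, and evaluates the three information numbers; the helper name `two_i` is hypothetical.

```python
import math

# Cell order 11, 12, 21, 22, 31, 32, 41, 42.
x = [1, 9, 6, 4, 3, 7, 8, 2]                        # observed table
x1 = [4.5, 5.5] * 4                                 # x_1*(ij) = x(i.)x(.j)/N
xstar = [1.901434, 8.098566, 3.445101, 6.554895,
         5.405508, 4.594491, 7.247968, 2.752036]    # table 4


def two_i(p, q):
    """Twice the discrimination information, 2I(p:q) = 2 * sum p ln(p/q)."""
    return 2 * sum(a * math.log(a / b) for a, b in zip(p, q))


total = two_i(x, x1)        # 2I(x:x_1*),  3 D.F.
effect = two_i(xstar, x1)   # 2I(x*:x_1*), 1 D.F.
residual = two_i(x, xstar)  # 2I(x:x*),    2 D.F.

print(round(total, 3), round(effect, 3), round(residual, 3))
print(round(total - effect - residual, 6))
```

The last printed value is the departure from exact additivity; it is small rather than exactly zero because the table 4 entries are rounded to six decimals.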


From the output and Fig. 1 we see that, since $x_1^*$ was the initial distribution,

$$\ln\frac{x^*(1)}{x^*(2)} = \ln\frac{x_1^*(1)}{x_1^*(2)} + \tau_1, \quad\text{or}\quad -1.449079 = -0.200671 - 1.248407,$$

$$\ln\frac{x^*(3)}{x^*(4)} = \ln\frac{x_1^*(3)}{x_1^*(4)} + \tau_1 + \tau_2, \quad\text{or}\quad -0.643259 = -0.200671 - 1.248407 + 0.805820,$$

$$\ln\frac{x^*(5)}{x^*(6)} = \ln\frac{x_1^*(5)}{x_1^*(6)} + \tau_1 + 2\tau_2, \quad\text{or}\quad 0.162560 = -0.200671 - 1.248407 + 1.611640,$$

$$\ln\frac{x^*(7)}{x^*(8)} = \ln\frac{x_1^*(7)}{x_1^*(8)} + \tau_1 + 3\tau_2, \quad\text{or}\quad 0.968380 = -0.200671 - 1.248407 + 2.417460,$$

or

$$\ln\frac{x^*(3)}{x^*(4)} - \ln\frac{x^*(1)}{x^*(2)} = 0.805820,$$

$$\ln\frac{x^*(5)}{x^*(6)} - \ln\frac{x^*(3)}{x^*(4)} = 0.805819,$$

$$\ln\frac{x^*(7)}{x^*(8)} - \ln\frac{x^*(5)}{x^*(6)} = 0.805820.$$

Note that $X^2 = 6.545451 = (9)^2(0.080808)$, that is, the quadratic approximation to $2I(x^*:x_1^*)$, also obtainable as

$$2I(x^*:x_1^*) \approx \sum \frac{\left(x^*(ij) - x_1^*(ij)\right)^2}{x_1^*(ij)} = \frac{10\left((2.599)^2 + (1.055)^2 + (0.906)^2 + (2.748)^2\right)}{(4.5)(5.5)} = 0.40404(16.2401) = 6.562.$$
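The two quadratic approximations can be reproduced with a short Python sketch. It rests on an assumption that should be stated plainly: the 2 x 2 matrix printed as S22.1 in the output is taken here to be the covariance matrix, under x_1*, of the two linear functions (total deaths, and x(21) + 2x(31) + 3x(41)), computed from independent binomial samples of size 10 with death probability 0.45. That reading reproduces the factor 0.080808; the variable names are hypothetical.

```python
# Cell order 11, 12, 21, 22, 31, 32, 41, 42.
x1 = [4.5, 5.5] * 4
xstar = [1.901434, 8.098566, 3.445101, 6.554895,
         5.405508, 4.594491, 7.247968, 2.752036]

# Sum of (x* - x_1*)^2 / x_1* over the eight cells.
quad = sum((a - b) ** 2 / b for a, b in zip(xstar, x1))

# Covariance of the two restraint functions under x_1* (assumed binomial).
dose = [0, 1, 2, 3]
v = 10 * 0.45 * 0.55                      # variance of deaths in one sample
s11 = sum(v for _ in dose)                # var(total deaths)
s12 = sum(d * v for d in dose)            # covariance of the two functions
s22 = sum(d * d * v for d in dose)        # var(sum of d * deaths)
det = s11 * s22 - s12 * s12
inv22 = s11 / det                         # (2,2) element of the inverse

# Observed minus initial value of x(21) + 2x(31) + 3x(41): 36 - 27.
delta = (6 + 2 * 3 + 3 * 8) - (4.5 + 2 * 4.5 + 3 * 4.5)
x2 = delta ** 2 * inv22

print(round(inv22, 6), round(x2, 6), round(quad, 3))
```

Carrying full precision in the cell differences gives a value of the second sum slightly below the 6.562 quoted above, which was computed from differences rounded to three decimals.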
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Computer  Input 

JOB  CARD 
EX  PROGRAM 

TITLE  =  'LOGISTIC FIT BERKSON"S MDI'
UNIF  =  '0'B
NUMSET  =  4
BMAT  =  '1'B

TOL1  =  .001  TOL2  =  .001
CNSTRNT  =  6  OBS  =  8  ;

2  2  2  2

11000000

00110000

00001100

00000011

10101010

00102030

1  9  6  4  3  7  8  2

4.5  5.5  4.5  5.5  4.5  5.5  4.5  5.5 

Table  5 
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L  i'L  I  .>  1  J  (  Ml  t  1  kKm  i  '  S  'Ll 


il  MA  i  .  i  > 


WEIGHT(1)=  0.250000
WEIGHT(2)=  0.250000
WEIGHT(3)=  0.250000
WEIGHT(4)=  0.250000


INV_WEIGHT(1)=  4.000000
INV_WEIGHT(2)=  4.000000
INV_WEIGHT(3)=  4.000000
INV_WEIGHT(4)=  4.000000


C DESIGN MATRIX


OBSERVED VALUES
X(1)=  1.000000     LN_X(1)=  0.000000
X(2)=  9.000000     LN_X(2)=  2.197225
X(3)=  6.000000     LN_X(3)=  1.791759
X(4)=  4.000000     LN_X(4)=  1.386294
X(5)=  3.000000     LN_X(5)=  1.098612
X(6)=  7.000000     LN_X(6)=  1.945910
X(7)=  8.000000     LN_X(7)=  2.079441
X(8)=  2.000000     LN_X(8)=  0.693147


CONSTRAINTS
NTHETA(1)=  40.000000
NTHETA(2)=  40.000000
NTHETA(3)=  40.000000
NTHETA(4)=  40.000000
NTHETA(5)=  18.000000
NTHETA(6)=  36.000000


INITIAL DISTRIBUTION
XSTART(1)=  4.500000     LN_XSTART(1)=  1.504077
XSTART(2)=  5.500000     LN_XSTART(2)=  1.704748
XSTART(3)=  4.500000     LN_XSTART(3)=  1.504077
XSTART(4)=  5.500000     LN_XSTART(4)=  1.704748
XSTART(5)=  4.500000     LN_XSTART(5)=  1.504077
XSTART(6)=  5.500000     LN_XSTART(6)=  1.704748
XSTART(7)=  4.500000     LN_XSTART(7)=  1.504077
XSTART(8)=  5.500000     LN_XSTART(8)=  1.704748


ESTIMATED NTHETA AT COUNT= 1

NTHAT(1)=  40.000000
NTHAT(2)=  40.000000
NTHAT(3)=  40.000000
NTHAT(4)=  40.000000
NTHAT(5)=  18.000000
NTHAT(6)=  27.000000


1 

2 

1 

•t 

5 

b 

1 

LOO 

d 

0 

0 

ia 

w 

c 

'J 

lot) 

0 

0 

la 

ia 

) 

0 

0 

IbO 

0 

id 

lb 

'» 

u 

0 

0 

ICO 

1.3 

54 

1  ri 

i  a 

Id 

lb 

la 

27 

t 

j 

la 

3o 

bt 

2  7 

b3 

>22.1 


1 


1 

0  .400002 
14. «b00O4 


2 

i<» .  rtb0004 
3h .650009 


S22.  1_  I  N  V 


l 


2 


1 


0.2H2b2d  -0.i212i<- 

-0.  12  12  12  0.  . dUb'j tt 
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UbLT  A  l  l  I  =  0  .  J o U  J ■-» U 
DtL  T  A  ( <.  )  -  G.JJJJOO 
DLLT4  (  3  >-  O.OUOJoO 
Dfcl.  TA14}  =  O.OuJj  0  ) 
'fcLTA(M  =  0.000  JuO 
~DfcLTA(t)=  S.JuOJud 


XSO  =  (.  .  bn  9  4  b  i 

FSTIPCIf  LF  a  aT  i.bUM 
XS  TAM  l  )  -  2. Mb  85L> 
XSTAF  (2)=  7.b44l42 
X  S  T  AP  (  ^  )  -  J.t,2bbii 
XSTAk  (4)  =  b.j  74nbl 
XbTAPCrM  5  .  n  3  6  b  l  3 
X  $  T  A  P  ( t  1  =  4  .  b  4  3  4  b  H 
XSTAM  7  »  =  7. Ob 4 .194 
XSTAk  ( P  )  -  2.  '■*  l  JbOt! 


1 

L  N_._  Xo7A-lii=  0.  /Lo  i(ic 
L  l\_-X  i»  I  4b  U  )  -  2.vO'WtW 
L  i  v .  X  S  1  A  •’  ( 3  )  =  i.  •*.<'/  9  9  b 
L  a'  Xb  Mi-  (  4  1  -  1  .  •  ■  92  10  J 

l  N  _XS  7  \M  J  )  i  ,ol)  lb  J4 
Lk_XS  I  At-  (  o  »  1.7<:hl 

L N_. X S  I  A {/!-•  l.'jioSbv 
l  W..XSTAM  c  )  -•  i.vOfiJM 


Z  IS  (.  1  Si  P  V I  t »  TAbLF  A  X  IS  INITIAL  LIST. 
2I(XSTAP:>)= 

M(Z:>S7AKJ=  c.0bl37l 

T  AU ( l ) =- 1 .09  J9Un 
TAU(2I  =  0.  72/2  7  2 


ESTIMATE OF X AT COUNT= 4


XSTAR(1)=  1.901434
XSTAR(2)=  8.098566
XSTAR(3)=  3.445101
XSTAR(4)=  6.554895
XSTAR(5)=  5.405508
XSTAR(6)=  4.594491
XSTAR(7)=  7.247968
XSTAR(8)=  2.752036


L  N_  X  S  T  A  P  (  i  )  =  0  *  c  h  2  b  0  0 
L  N_  A  S  T  A  P  (  2  1  =  2.0'7ioo7^ 
L  im_  X  b  T  A  F  (  j  >  -  1  •  c  0  6 '»  j  3 
l  N_  XST  AP  ( 4  1  =  1  •  o  S  J  2  l 
Li\„XSTAP(3)-  i.od/nlc 
LN_A SI  AK  (  c>  1  =  1.S^4o‘jc 
l  (m_XS  f  AM7  )  =  l  ,9b0  72  l 
L  M_  X  i  T  A  F  (  o  )  -  1  .  0  1 2  3  «  1 


'"l  IS  Lb r  L  P  Vt.;  T  Ac  Lt  mNO  a  IS  INITIAL  LISl. 
2l(XSlAP:X)=  l  .  8  7  a  ‘i  2  2 

2I(Z:XSTAR)=  5.985401

TAU(1)= -1.248407
TAU(2)=  0.805820


I'jTIHT'  cr  MHtTA  AT  U.UM  =  s 

N t h A l  (i  i  =  os.sssbos 

N  TH A  T  (  C  )  =  tu . UUc. uU 

NTHAT(.M=  4C.CCJ'-\>0 

imThAT  14  )  =  iS.99SSbb 

^ ThAT 16  )-  lb. 00 Jo 31 

MThATTt.  )=  3<).CJJ«J00 


S22. 1 


1  d.u'/tjtl  lj.2US.jtl 

2  1 3 .  tOS 3o  1  30.l44b01 


S22.1. INV 


1 

1  0 . 4  0 l S  2  S  -U.l/ol2t 

c  -0.1761 2o  J. 110302 


i.tLIA(l)-  .000031 
i )  h l T  A (21=  O.OGOOOO 
! > C L 1  »’•  1  j  )  -  o.OtUOoU 


DU  1/(41-  0 .00  101b 
UtLU  tb  )=-U.0(J  Ju3  1 
DLL  T  A  ( <_  {  =  0 .  Jl’i  '  jO 


OUT L I r  K  |  1)  =  2  .2 Jtibbb 
UUTLirHiU  l.r,b4Lc 
OUT l  11 M  3  )  =  0.^  HI 7HV 


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0 1<  I  L  I  l  h  (  4  I  -  ■  >  •  1  o  l»  u  '>  L 
U  l '  T  l.  1  f  K  ( *.  < )  =  i .  1  <  i  u  0  i  o 
OLll!IHtl=  •) .  lo/ 

1  111  T  L  1(>  (  /  )  a  l.jilv/vs 
ilTLllMul-  1.9JW10 


!TbMTICN5= 


TOL1=  0.0010


TOL2=  0.0010
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